PROBLEM STATEMENT

  • DOMAIN: Food Industry.
  • CONTEXT: Computer Vision (CV) technologies have revolutionized automation in various industries, including the food industry. In this project, we explore the use of deep learning-based computer vision models to automatically identify food items from images. Such identification can be used to trigger appropriate actions or alerts based on the type, color, or presence of specific food ingredients. Automating the recognition of food items can be especially useful in environments like kitchens, restaurants, food-delivery platforms, or dietary tracking applications.

• DATA DESCRIPTION:

  • We use a 17-class subset of the Food101 dataset containing a total of 16,256 images. Each food category (e.g., apple_pie, pizza, tacos) includes a diverse range of image samples with natural variations in lighting, background, and presentation.

For the scope of Milestone 1:

  • We have selected 10 food classes from the dataset.
  • For each selected class, 50 images have been extracted for training and annotation purposes.
  • The dataset has been split into training and testing sets, maintaining approximately a 70:30 ratio.

These images were manually annotated using a standard image annotation tool to identify and localize the food object within each image. These annotations are used for displaying bounding boxes and training object detection models in the subsequent milestones.
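The 70:30 split described above can be sketched as a per-class shuffle-and-slice so that both sets keep the class balance. This is a minimal, stdlib-only illustration (the helper name and toy paths are hypothetical; the project's actual split was produced during annotation, and sklearn's `train_test_split` with `stratify=` would achieve the same):

```python
import random
from collections import defaultdict

def stratified_split(image_paths, labels, train_frac=0.7, seed=42):
    """Split paths roughly 70:30 per class so both sets keep the class balance."""
    by_class = defaultdict(list)
    for path, label in zip(image_paths, labels):
        by_class[label].append(path)

    rng = random.Random(seed)
    train, test = [], []
    for label, class_paths in by_class.items():
        rng.shuffle(class_paths)
        cut = int(len(class_paths) * train_frac)  # 35 of 50 per class here
        train += [(p, label) for p in class_paths[:cut]]
        test += [(p, label) for p in class_paths[cut:]]
    return train, test

# Toy example mirroring Milestone 1: 10 classes x 50 images
paths = [f"class_{c}/img_{i}.jpg" for c in range(10) for i in range(50)]
labels = [f"class_{c}" for c in range(10) for _ in range(50)]
train, test = stratified_split(paths, labels)
print(len(train), len(test))  # 350 150
```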

PROJECT OBJECTIVE: Design a DL-based Food identification model

Importing the libraries

In [ ]:
# Importing the libraries
# Standard libraries
import os
import random
import json
import re
from collections import defaultdict

# File handling and image processing
import zipfile
from PIL import Image
import cv2

# Data manipulation and visualization
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from IPython.display import clear_output
%matplotlib inline

# Machine learning and deep learning
from sklearn.model_selection import train_test_split
import torch
import tensorflow as tf
In [ ]:
#Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

#Logic to access the shared folder in Google Drive using shortcuts

#Set Project base path
project_path = '/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/'
#project_path = '/content/drive/MyDrive/shortcuts/Capstone/'
Mounted at /content/drive

Extract Image Data

In [ ]:
# Define the path to the zip file and the extraction directory
zip_file_path = project_path + 'Food_101.zip'
food_dir = project_path + 'Food_101'

# Extract the zip file if the directory doesn't exist
if not os.path.exists(food_dir):
    print(f"Extracting '{zip_file_path}' to '{food_dir}'...")
    with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(project_path)
    print("Extraction complete.")
else:
    print(f"Directory '{food_dir}' already exists. Skipping extraction.")
Directory '/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/Food_101' already exists. Skipping extraction.
In [ ]:
# Function to read images and extract labels
def load_food_data(food_dir):
    image_paths = []
    labels = []
    for folder_name in os.listdir(food_dir):
        folder_path = os.path.join(food_dir, folder_name)
        if not os.path.isdir(folder_path):
            continue
        for image_file in os.listdir(folder_path):
            if image_file.lower().endswith(('.png', '.jpg', '.jpeg')):  # Check for image extensions
                image_paths.append(os.path.join(folder_path, image_file))
                labels.append(folder_name)  # Use folder name as label
    return image_paths, labels

image_paths, labels = load_food_data(food_dir)

# Check number of loaded images and labels
print(f"Number of images loaded : {len(image_paths)}")
print(f"Number of labels loaded : {len(labels)}")
Number of images loaded : 16253
Number of labels loaded : 16253

Exploratory Data Analysis (EDA)

Visualizing Sample Images

In [ ]:
# Display random images with their labels
def display_random_images(image_paths, labels, num_images=5):
    """
    Displays a random selection of images with their corresponding labels.

    Args:
        image_paths (list): List of image file paths.
        labels (list): List of labels corresponding to the images.
        num_images (int): Number of random images to display.
    """
    if not image_paths or not labels:
        print("No images or labels to display.")
        return

    # Randomly select indices
    random_indices = random.sample(range(len(image_paths)), min(num_images, len(image_paths)))

    plt.figure(figsize=(15, 8))
    for i, idx in enumerate(random_indices):
        try:
            img = cv2.imread(image_paths[idx])
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB for matplotlib
            plt.subplot(1, num_images, i + 1)
            plt.imshow(img)
            plt.title(labels[idx])
            plt.axis('off')
        except Exception as e:
            print(f"Error loading or displaying image {image_paths[idx]}: {e}")
    plt.tight_layout()
    plt.show()
In [ ]:
# Display 5 random images
display_random_images(image_paths, labels, num_images=5)
[Figure: five random sample images with their class labels]

Observations

  • The images in the dataset have varying sizes and aspect ratios. To ensure compatibility with the model, we need to resize them to a consistent size before model building.

Identify Classes

In [ ]:
# Automatically extract food classes by listing subdirectories
try:
    all_classes = sorted([d for d in os.listdir(food_dir) if os.path.isdir(os.path.join(food_dir, d))])
    print(f"Found the following food classes: {all_classes}")
except FileNotFoundError:
    print(f"Error: Food directory not found at {food_dir}. Please ensure the path is correct.")
    raise

selected_classes = all_classes  # Use all found classes
Found the following food classes: ['apple_pie', 'chocolate_cake', 'donuts', 'falafel', 'french_fries', 'hot_dog', 'ice_cream', 'nachos', 'onion_rings', 'pancakes', 'pizza', 'ravioli', 'samosa', 'spring_rolls', 'strawberry_shortcake', 'tacos', 'waffles']

Understanding Dataset Composition

In [ ]:
import os
from collections import Counter
import matplotlib.pyplot as plt

def analyze_images(image_paths, labels):
    # Count image formats based on file extensions
    formats = Counter()
    for image_path in image_paths:
        ext = os.path.splitext(image_path)[1].lower()  # Get file extension
        formats[ext] += 1

    # Count number of images per class
    class_counts = Counter(labels)

    return formats, class_counts

# Analyze the data
formats, class_counts = analyze_images(image_paths, labels)

# Visualize Image Formats
def visualize_image_formats(formats):
    plt.figure(figsize=(8, 5))
    bars = plt.bar(formats.keys(), formats.values(), color='skyblue')
    plt.title("Image Formats Distribution")
    plt.xlabel("Image Format")
    plt.ylabel("Number of Images")
    plt.xticks(rotation=45)

    # Add annotations (numbers) on top of the bars
    for bar in bars:
        height = bar.get_height()
        plt.text(bar.get_x() + bar.get_width() / 2, height, str(height), ha='center', va='bottom')

    plt.show()

# Visualize Number of Images per Class
def visualize_class_distribution(class_counts):
    plt.figure(figsize=(10, 6))
    bars = plt.bar(class_counts.keys(), class_counts.values(), color='lightgreen')
    plt.title("Number of Images per Class")
    plt.xlabel("Class")
    plt.ylabel("Number of Images")
    plt.xticks(rotation=45)

    # Add annotations (numbers) on top of the bars
    for bar in bars:
        height = bar.get_height()
        plt.text(bar.get_x() + bar.get_width() / 2, height, str(height), ha='center', va='bottom')

    plt.show()

# Call visualization functions
visualize_image_formats(formats)
visualize_class_distribution(class_counts)
[Figure: bar chart of image format counts]
[Figure: bar chart of images per class]

Observations

  • All images are in JPG format.
  • apple_pie has only 256 images and ice_cream has 997, while all other classes have 1,000 images each, so there is a class imbalance.

Load training and test annotations

In [ ]:
# Function to extract bounding box data from COCO annotations
def extract_bounding_box_data(annotation_file_path):
    """
    Extracts bounding box data from a COCO JSON annotation file.

    Args:
        annotation_file_path (str): Path to the COCO JSON annotation file.

    Returns:
        dict: A dictionary where keys are image file names and values are lists of bounding box data.
              Each bounding box data is a dictionary with 'class_name' and 'bbox' (x, y, width, height).
    """
    try:
        with open(annotation_file_path, 'r') as f:
            coco_data = json.load(f)
    except FileNotFoundError:
        print(f"Error: File not found at {annotation_file_path}")
        return {}

    # Create a mapping of category_id to class names
    category_id_to_name = {category['id']: category['name'] for category in coco_data['categories']}

    # Create a mapping of image_id to file names
    image_id_to_file_name = {image['id']: image['file_name'] for image in coco_data['images']}

    # Extract bounding box data
    bounding_box_data = {}
    for annotation in coco_data['annotations']:
        image_id = annotation['image_id']
        category_id = annotation['category_id']
        bbox = annotation['bbox']  # [x, y, width, height]

        # Map category_id and image_id to their respective names
        class_name = category_id_to_name.get(category_id)
        image_file_name = image_id_to_file_name.get(image_id)
        # Strip the Roboflow export suffix (e.g. "_jpg.rf.<hash>.jpg") to recover the original
        # file name; note this assumes the original stem contains no underscores
        image_file_name = re.sub(r'_.*', '.jpg', image_file_name)

        if class_name and image_file_name:
            if image_file_name not in bounding_box_data:
                bounding_box_data[image_file_name] = []
            bounding_box_data[image_file_name].append({
                'class_name': class_name,
                'bbox': bbox
            })

    return bounding_box_data
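The filename normalization used in `extract_bounding_box_data` can be exercised in isolation. Because the pattern cuts at the first underscore, it only round-trips correctly for stems without underscores, which is worth keeping in mind if class names ever appear in file names (the example filenames below are hypothetical):

```python
import re

def normalize_roboflow_name(name):
    """Strip a Roboflow-style export suffix, e.g. '123_jpg.rf.<hash>.jpg' -> '123.jpg'."""
    return re.sub(r'_.*', '.jpg', name)

print(normalize_roboflow_name("3795349_jpg.rf.0a1b2c3d.jpg"))       # 3795349.jpg
# Caveat: a stem containing underscores is truncated at the first one
print(normalize_roboflow_name("apple_pie_01_jpg.rf.0a1b2c3d.jpg"))  # apple.jpg
```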
In [ ]:
# Get training Bounding box data
train_annotation_file = project_path + 'fooddetection-cap-cv3-may24b.v7i.coco/train/_annotations.coco.json'
train_bounding_box_data = extract_bounding_box_data(train_annotation_file)

# Get testing Bounding box data
test_annotation_file = project_path + 'fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json'
test_bounding_box_data = extract_bounding_box_data(test_annotation_file)

Display images with label and bounding box

In [ ]:
# Get annotated training and test data image paths
annotated_training_image_paths = []
annotated_testing_image_paths = []
for image_path in image_paths:
    image_file_name = os.path.basename(image_path)
    # Check if the image has training bounding box data
    if image_file_name in train_bounding_box_data:
        annotated_training_image_paths.append(image_path)
    # Check if the image has test bounding box data
    if image_file_name in test_bounding_box_data:
        annotated_testing_image_paths.append(image_path)
In [ ]:
# Function to display 5 images with bounding boxes in a subplot
def display_sample_images_with_bounding_boxes(image_paths, bounding_box_data, num_images=5):
    """
    Displays a random selection of images with bounding boxes in a subplot.

    Args:
        image_paths (list): List of image file paths.
        bounding_box_data (dict): Dictionary where keys are image file names and values are lists of bounding box data.
        num_images (int): Number of images to display in the subplot.
    """
    # Filter image paths that have bounding box data
    annotated_image_paths = [path for path in image_paths if os.path.basename(path) in bounding_box_data]

    # Randomly select a subset of images
    selected_image_paths = random.sample(annotated_image_paths, min(num_images, len(annotated_image_paths)))

    # Create a subplot
    plt.figure(figsize=(15, 10))
    for i, image_path in enumerate(selected_image_paths):
        image_file_name = os.path.basename(image_path)
        bounding_boxes = bounding_box_data[image_file_name]

        # Open the image
        image = Image.open(image_path)

        # Add the image to the subplot
        ax = plt.subplot(1, num_images, i + 1)
        ax.imshow(image)
        ax.axis('off')

        # Add bounding boxes to the image
        for box in bounding_boxes:
            class_name = box['class_name']
            x, y, width, height = box['bbox']

            # Create a rectangle patch
            rect = patches.Rectangle((x, y), width, height, linewidth=2, edgecolor='red', facecolor='none')
            ax.add_patch(rect)

        # Add the label (class name) as the title of the subplot
        ax.set_title(bounding_boxes[0]['class_name'], fontsize=12, color='blue')

    plt.tight_layout()
    plt.show()
Display training images with bounding box and label
In [ ]:
# Display random training images with bounding box
display_sample_images_with_bounding_boxes(annotated_training_image_paths, train_bounding_box_data, num_images=5)
[Figure: random training images with bounding boxes]
Display test images with bounding box and label
In [ ]:
# Display random test images with bounding box
display_sample_images_with_bounding_boxes(annotated_testing_image_paths, test_bounding_box_data, num_images=5)
[Figure: random test images with bounding boxes]

Model Building

Generate Test and Train Data

In [ ]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.applications.efficientnet import preprocess_input
from tensorflow.keras.callbacks import ModelCheckpoint
import os

# Parameters
IMAGE_SIZE = (224, 224)
BATCH_SIZE = 32
NUM_CLASSES = 17
DATA_DIR = food_dir  # Dataset folder with subfolders per class

# Directory to save model weights
checkpoint_dir = os.path.join(project_path, "EfficientNetB0-checkpoints")
os.makedirs(checkpoint_dir, exist_ok=True)

# Path to save the model weights
checkpoint_path = os.path.join(checkpoint_dir, "model_epoch_{epoch:02d}.weights.h5")

# Data augmentation and preprocessing
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=20,  # Rotate images up to 20 degrees
    width_shift_range=0.2,  # Shift images horizontally by 20% of the width
    height_shift_range=0.2,  # Shift images vertically by 20% of the height
    shear_range=0.2,  # Apply shear transformations with a 20% intensity
    horizontal_flip=True,  # Flip images horizontally
    validation_split=0.3  # 30% validation split
)

# Create a training data generator
train_generator = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE, # Resize images to the specified size
    batch_size=BATCH_SIZE, # Number of images to process in a batch
    class_mode='categorical',
    subset='training',
    shuffle=True # Shuffle the training data for better generalization
)

# Create a validation data generator
validation_generator = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE, # Resize images to the specified size
    batch_size=BATCH_SIZE, # Number of images to process in a batch
    class_mode='categorical',
    subset='validation',
    shuffle=False # Do not shuffle validation data to maintain order
)
Found 11378 images belonging to 17 classes.
Found 4875 images belonging to 17 classes.
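The generator counts above can be sanity-checked by hand. Assuming Keras reserves `int(n * validation_split)` files of each class for validation, the per-class counts observed in the EDA (apple_pie 256, ice_cream 997, and 1,000 for each of the remaining 15 classes) reproduce the reported 11,378 / 4,875 figures:

```python
# Per-class image counts observed in the EDA (the class_N placeholder names
# stand in for the 15 classes that each hold 1,000 images)
class_counts = {"apple_pie": 256, "ice_cream": 997}
class_counts.update({f"class_{i}": 1000 for i in range(15)})

VALIDATION_SPLIT = 0.3
# Assumed rounding: int(n * split) validation files per class, remainder for training
val_total = sum(int(n * VALIDATION_SPLIT) for n in class_counts.values())
train_total = sum(n - int(n * VALIDATION_SPLIT) for n in class_counts.values())

print(train_total, val_total)  # 11378 4875
```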
  • Using EfficientNetB0 as the base model
In [ ]:
# Load EfficientNetB0 base model without top layer
base_model = EfficientNetB0(include_top=False, input_shape=IMAGE_SIZE + (3,), weights='imagenet')
# Freeze base model
base_model.trainable = False

# Build the classification head on top of the base model
# Define the input layer with the same shape as the input images
inputs = layers.Input(shape=IMAGE_SIZE + (3,))
# Pass the inputs through the base model (EfficientNetB0) without updating its weights
x = base_model(inputs, training=False)
# Apply global average pooling to reduce the spatial dimensions
x = layers.GlobalAveragePooling2D()(x)
# Add a dropout layer to reduce overfitting
x = layers.Dropout(0.3)(x)
# Add the output layer with the number of classes and softmax activation for classification
outputs = layers.Dense(NUM_CLASSES, activation='softmax')(x)
model = models.Model(inputs, outputs)

# Compile the model with the Adam optimizer, categorical crossentropy loss, and accuracy metric
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
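With the backbone frozen, only the classification head learns. Assuming EfficientNetB0's final feature map has 1,280 channels (its standard output width), the trainable parameter count is just the Dense layer's weight matrix plus biases; GlobalAveragePooling2D and Dropout add no parameters. A quick back-of-the-envelope check:

```python
FEATURES = 1280     # EfficientNetB0 output channels after global average pooling
NUM_CLASSES = 17

# Dense(NUM_CLASSES) on a 1280-dim feature vector: weight matrix + bias vector
trainable_params = FEATURES * NUM_CLASSES + NUM_CLASSES
print(trainable_params)  # 21777
```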
  • Train the model, resuming from the last saved checkpoint in case a previous run was interrupted
In [ ]:
# Check if there are any saved weights to resume training
latest_checkpoint = None
epochs_completed = 0  # Default to 0 if no checkpoint is found

if os.path.exists(checkpoint_dir) and os.listdir(checkpoint_dir):
    # Find the latest checkpoint file
    checkpoint_files = [
        f for f in os.listdir(checkpoint_dir) if f.endswith(".h5")
    ]
    if checkpoint_files:
        latest_checkpoint = max(
            [os.path.join(checkpoint_dir, f) for f in checkpoint_files],
            key=os.path.getctime
        )
        # Extract the epoch number from the checkpoint filename
        match = re.search(r"epoch_(\d+)", latest_checkpoint)
        if match:
            epochs_completed = int(match.group(1))
        print(f"Resuming training from epoch {epochs_completed}. Loading weights from: {latest_checkpoint}")
        model.load_weights(latest_checkpoint)
    else:
        print("No checkpoint found. Starting training from scratch.")
else:
    print("No checkpoints found. Starting training from scratch.")

# Callback to save model weights at every epoch
checkpoint_path = os.path.join(checkpoint_dir, "model_epoch_{epoch:02d}.weights.h5")
checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_path,
    save_weights_only=True,  # Save only weights
    save_best_only=False,    # Save weights at every epoch
    verbose=1
)

# Train model (feature extraction)
total_epochs = 10  # Total epochs you want to train
remaining_epochs = total_epochs - epochs_completed

if remaining_epochs > 0:
    print(f"Training with remaining epochs: {remaining_epochs}")
    # Train the model starting from the last completed epoch and save weights at each epoch
    history = model.fit(
        train_generator,
        validation_data=validation_generator,
        epochs=epochs_completed + remaining_epochs,  # Total epochs to train
        initial_epoch=epochs_completed,  # Start from the last completed epoch
        callbacks=[checkpoint_callback], # Save model weights at each epoch
        verbose=1  # Ensure logs are displayed
    )
else:
    print("Training is already complete. No remaining epochs to train.")
Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5
16705208/16705208 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
No checkpoint found. Starting training from scratch.
/usr/local/lib/python3.11/dist-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 11s/step - accuracy: 0.5188 - loss: 1.7030 
Epoch 1: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_01.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 5385s 15s/step - accuracy: 0.5192 - loss: 1.7018 - val_accuracy: 0.7689 - val_loss: 0.8104
Epoch 2/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 553ms/step - accuracy: 0.7511 - loss: 0.8361
Epoch 2: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_02.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 280s 787ms/step - accuracy: 0.7511 - loss: 0.8361 - val_accuracy: 0.7853 - val_loss: 0.7186
Epoch 3/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 550ms/step - accuracy: 0.7814 - loss: 0.7189
Epoch 3: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_03.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 279s 783ms/step - accuracy: 0.7814 - loss: 0.7189 - val_accuracy: 0.7931 - val_loss: 0.6671
Epoch 4/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 542ms/step - accuracy: 0.7938 - loss: 0.6769
Epoch 4: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_04.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 274s 771ms/step - accuracy: 0.7938 - loss: 0.6769 - val_accuracy: 0.8048 - val_loss: 0.6478
Epoch 5/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 530ms/step - accuracy: 0.8102 - loss: 0.6263
Epoch 5: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_05.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 268s 754ms/step - accuracy: 0.8102 - loss: 0.6263 - val_accuracy: 0.8187 - val_loss: 0.6269
Epoch 6/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 522ms/step - accuracy: 0.8127 - loss: 0.6109
Epoch 6: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_06.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 268s 754ms/step - accuracy: 0.8127 - loss: 0.6109 - val_accuracy: 0.8146 - val_loss: 0.6151
Epoch 7/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 521ms/step - accuracy: 0.8128 - loss: 0.6025
Epoch 7: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_07.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 265s 745ms/step - accuracy: 0.8128 - loss: 0.6025 - val_accuracy: 0.8101 - val_loss: 0.6068
Epoch 8/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 519ms/step - accuracy: 0.8164 - loss: 0.5848
Epoch 8: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_08.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 321s 742ms/step - accuracy: 0.8164 - loss: 0.5848 - val_accuracy: 0.8171 - val_loss: 0.6063
Epoch 9/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 514ms/step - accuracy: 0.8173 - loss: 0.5811
Epoch 9: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_09.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 263s 739ms/step - accuracy: 0.8173 - loss: 0.5811 - val_accuracy: 0.8171 - val_loss: 0.6078
Epoch 10/10
356/356 ━━━━━━━━━━━━━━━━━━━━ 0s 532ms/step - accuracy: 0.8263 - loss: 0.5399
Epoch 10: saving model to /content/drive/MyDrive/shortcuts/Capstone/checkpoints/model_epoch_10.weights.h5
356/356 ━━━━━━━━━━━━━━━━━━━━ 269s 755ms/step - accuracy: 0.8263 - loss: 0.5399 - val_accuracy: 0.8160 - val_loss: 0.6067

Plot Training History

In [ ]:
import os
import json
import matplotlib.pyplot as plt

# Path to save/load training history
history_path = os.path.join(checkpoint_dir, "training_history.json")

print(history_path)

# Load existing training history if it exists
if os.path.exists(history_path):
    print(f"Loading existing training history from: {history_path}")
    with open(history_path, "r") as f:
        history_data = json.load(f)
else:
    print("No existing training history found. Initializing new history.")
    history_data = {"accuracy": [], "val_accuracy": [], "loss": [], "val_loss": []}

# Update history with the current training results if `history` exists
if 'history' in globals() and hasattr(history, 'history'):
    print("Updating training history with new results.")
    for key, values in history.history.items():
        if key in history_data:
            history_data[key].extend(values)
        else:
            history_data[key] = values
else:
    print("No new training history found. Using existing history.")

# Save updated training history to a JSON file
with open(history_path, "w") as f:
    json.dump(history_data, f)

# Plot training history
plt.figure(figsize=(12, 6))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history_data["accuracy"], label="Training Accuracy")
plt.plot(history_data["val_accuracy"], label="Validation Accuracy")
plt.title("Model Accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history_data["loss"], label="Training Loss")
plt.plot(history_data["val_loss"], label="Validation Loss")
plt.title("Model Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()

# Save the plot
plot_path = os.path.join(checkpoint_dir, "training_history.png")
plt.savefig(plot_path)
plt.show()
/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-checkpoints/training_history.json
Loading existing training history from: /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-checkpoints/training_history.json
No new training history found. Using existing history.
[Figure: training/validation accuracy and loss curves]

Model Performance using the best weights

In [ ]:
# Path to the weights saved in epoch 10
best_model_weights_path = os.path.join(checkpoint_dir, "model_epoch_10.weights.h5")

# Load the weights into the model
if os.path.exists(best_model_weights_path):
    print(f"Loading weights from: {best_model_weights_path}")
    model.load_weights(best_model_weights_path)
else:
    raise FileNotFoundError(f"Weights file not found at: {best_model_weights_path}")

# Evaluate the model on the validation dataset
val_loss, val_accuracy = model.evaluate(validation_generator)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")

# Use the model for predictions
# Example: Predict on a batch of images from the validation set
predictions = model.predict(validation_generator)
print(f"Predictions: {predictions}")
Loading weights from: /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-checkpoints/model_epoch_10.weights.h5
/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py:757: UserWarning: Skipping variable loading for optimizer 'adam', because it has 2 variables whereas the saved optimizer has 6 variables. 
  saveable.load_own_variables(weights_store.get(inner_path))
/usr/local/lib/python3.11/dist-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
153/153 ━━━━━━━━━━━━━━━━━━━━ 3582s 23s/step - accuracy: 0.7947 - loss: 0.6788
Validation Loss: 0.6096217632293701
Validation Accuracy: 0.8137435913085938
153/153 ━━━━━━━━━━━━━━━━━━━━ 91s 547ms/step
Predictions: [[4.5710128e-02 2.9984266e-03 6.2164436e-03 ... 8.3082145e-01
  2.1735409e-03 1.8250329e-02]
 [1.7405635e-02 1.3081112e-04 7.0863828e-02 ... 3.8991770e-04
  7.2069536e-03 2.1187686e-03]
 [6.3973136e-02 3.7134211e-03 2.1573217e-03 ... 1.3366306e-01
  6.2506380e-03 1.0985862e-01]
 ...
 [6.7677051e-03 5.0796423e-04 5.4097196e-05 ... 2.8181546e-05
  6.5848610e-07 9.8912829e-01]
 [6.0737634e-01 2.1629968e-04 9.1301580e-04 ... 5.9212081e-04
  9.7541713e-05 6.5970235e-02]
 [1.3818774e-02 3.6607740e-05 4.8196849e-05 ... 3.0777765e-06
  3.7474943e-05 6.3863069e-01]]

Classification report and Confusion Matrix

In [ ]:
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

def model_performance_classification_keras(
    true_labels, predicted_probs, class_labels
):
    """
    Function to compute different metrics to check classification model performance for multi-class problems.

    true_labels: Ground truth labels (integer class indices)
    predicted_probs: Predicted class probabilities from the model
    class_labels: List of class names
    """
    # Get predicted classes
    predicted_labels = np.argmax(predicted_probs, axis=1)

    # Calculate overall metrics
    acc = accuracy_score(true_labels, predicted_labels)  # Overall Accuracy

    # Calculate metrics for each class
    metrics = []
    for i, class_name in enumerate(class_labels):
        class_true = (true_labels == i).astype(int)
        class_pred = (predicted_labels == i).astype(int)

        recall = recall_score(class_true, class_pred, zero_division=0)
        precision = precision_score(class_true, class_pred, zero_division=0)
        f1 = f1_score(class_true, class_pred, zero_division=0)

        metrics.append(
            {"Class": class_name, "Recall": recall, "Precision": precision, "F1": f1}
        )

    # Create a summary DataFrame
    df_perf = pd.DataFrame(metrics)
    df_perf.loc["Overall"] = {
        "Class": "Overall",
        "Recall": df_perf["Recall"].mean(),
        "Precision": df_perf["Precision"].mean(),
        "F1": df_perf["F1"].mean(),
    }

    return acc, df_perf
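The conversion from softmax probabilities to hard labels via argmax, and the per-class one-vs-rest binarization used above, can be exercised on a tiny synthetic example. This is a pure-Python sketch mirroring the function's logic, not a test of the Keras model:

```python
def argmax(row):
    """Index of the largest probability in one softmax row."""
    return max(range(len(row)), key=row.__getitem__)

probs = [
    [0.7, 0.2, 0.1],   # predicted class 0
    [0.1, 0.8, 0.1],   # predicted class 1
    [0.3, 0.3, 0.4],   # predicted class 2
    [0.5, 0.4, 0.1],   # predicted class 0 (true label is 1 -> a miss)
]
true = [0, 1, 2, 1]
pred = [argmax(p) for p in probs]
print(pred)                                    # [0, 1, 2, 0]

acc = sum(t == p for t, p in zip(true, pred)) / len(true)
print(acc)                                     # 0.75

# One-vs-rest recall for class 1, as in the per-class loop above
tp = sum(1 for t, p in zip(true, pred) if t == 1 and p == 1)
recall_1 = tp / sum(1 for t in true if t == 1)
print(recall_1)                                # 0.5
```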
In [ ]:
from sklearn.metrics import classification_report, confusion_matrix
import pandas as pd

# Get true labels from the validation generator
true_labels = validation_generator.classes
class_labels = list(validation_generator.class_indices.keys())  # Class names

# Get predicted labels
predicted_labels = np.argmax(predictions, axis=1)

# Call the function
accuracy_initial, performance_df_initial = model_performance_classification_keras(
    true_labels=true_labels,
    predicted_probs=predictions,
    class_labels=class_labels
)

# Print overall accuracy
print(f"Overall Accuracy: {accuracy_initial:.2f}")

# Print class-wise performance metrics
print("Class-wise Performance Metrics:")
print(performance_df_initial)
Overall Accuracy: 0.81
Class-wise Performance Metrics:
                        Class    Recall  Precision        F1
0                   apple_pie  0.394737   0.545455  0.458015
1              chocolate_cake  0.846667   0.803797  0.824675
2                      donuts  0.806667   0.880000  0.841739
3                     falafel  0.746667   0.785965  0.765812
4                french_fries  0.940000   0.886792  0.912621
5                     hot_dog  0.850000   0.804416  0.826580
6                   ice_cream  0.755853   0.810036  0.782007
7                      nachos  0.813333   0.757764  0.784566
8                 onion_rings  0.896667   0.914966  0.905724
9                    pancakes  0.810000   0.791531  0.800659
10                      pizza  0.870000   0.906250  0.887755
11                    ravioli  0.853333   0.771084  0.810127
12                     samosa  0.843333   0.766667  0.803175
13               spring_rolls  0.773333   0.892308  0.828571
14       strawberry_shortcake  0.870000   0.765396  0.814353
15                      tacos  0.703333   0.770073  0.735192
16                    waffles  0.750000   0.797872  0.773196
Overall               Overall  0.795525   0.802963  0.797339
In [ ]:
# Generate confusion matrix
conf_matrix = confusion_matrix(true_labels, predicted_labels)
# Plot confusion matrix
plt.figure(figsize=(12, 10))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_labels,
            yticklabels=class_labels)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()
[Figure: confusion matrix heatmap]
  • onion_rings has the best precision, followed by pizza and spring_rolls, while french_fries has the highest recall and F1 score.

  • Since apple_pie has far fewer validation images (76, versus roughly 300 for the other classes), its metrics are noticeably lower. To improve the model further, we can assign a higher weight to this underrepresented class during training to penalize its misclassifications, or oversample it.
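sklearn's 'balanced' heuristic, applied in the next section, assigns each class the weight n_samples / (n_classes * n_c), so apple_pie receives roughly four times the weight of a 300-image class. A minimal two-class sketch with the validation counts from above (the helper name is illustrative):

```python
def balanced_weights(counts):
    """Replicate sklearn's class_weight='balanced': n_samples / (n_classes * n_c)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}

# apple_pie vs. a typical ~300-image class (validation-set counts)
weights = balanced_weights({"apple_pie": 76, "pizza": 300})
print(weights)  # apple_pie weight is 300/76 ~ 3.9x the pizza weight
```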

Model with balanced class weightsΒΆ

  • Running model with class_weight='balanced'
InΒ [Β ]:
from sklearn.utils.class_weight import compute_class_weight
import os
import re
import numpy as np
from tensorflow.keras.callbacks import ModelCheckpoint

# Compute class weights
class_weights = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(true_labels),
    y=true_labels
)
class_weights = dict(enumerate(class_weights))

# Find the last saved weights file
checkpoint_files = [
    f for f in os.listdir(checkpoint_dir) if re.match(r"balanced_model_epoch_\d+\.weights\.h5", f)
]
if checkpoint_files:
    # Extract the epoch numbers from the filenames
    epochs_completed = max(
        int(re.search(r"balanced_model_epoch_(\d+)\.weights\.h5", f).group(1))
        for f in checkpoint_files
    )
    last_checkpoint_path = os.path.join(checkpoint_dir, f"balanced_model_epoch_{epochs_completed:02d}.weights.h5")
    print(f"Resuming training from epoch {epochs_completed}. Loading weights from: {last_checkpoint_path}")
    model.load_weights(last_checkpoint_path)
else:
    print("No checkpoint found. Starting training from scratch.")
    epochs_completed = 0

# Path to save the model weights for fine-tuning
checkpoint_tuning1_path = os.path.join(checkpoint_dir, "balanced_model_epoch_{epoch:02d}.weights.h5")

# Callback to save model weights at every epoch
checkpoint_tuning1_callback = ModelCheckpoint(
    filepath=checkpoint_tuning1_path,
    save_weights_only=True,  # Save only weights
    save_best_only=False,    # Save weights at every epoch
    verbose=1
)

# Resume training with class weights
total_epochs = 7  # Total epochs you want to train
remaining_epochs = total_epochs - epochs_completed
if remaining_epochs > 0:
    print(f"Training with class weights. Remaining epochs: {remaining_epochs}")
    history_fine_tuned = model.fit(
        train_generator,
        validation_data=validation_generator,
        epochs=epochs_completed + remaining_epochs,  # Total epochs to train
        initial_epoch=epochs_completed,  # Start from the last completed epoch
        class_weight=class_weights,  # Apply class weights
        callbacks=[checkpoint_tuning1_callback],
        verbose=1  # Ensure logs are displayed
    )

    print("Training with class weights completed.")

    # Save the updated training history
    history_path = os.path.join(checkpoint_dir, "balanced_model_training_history.json")
    if os.path.exists(history_path):
        # Load previous history if it exists
        with open(history_path, "r") as f:
            previous_history = json.load(f)
        # Merge old and new histories
        for key, values in history_fine_tuned.history.items():
            if key in previous_history:
                previous_history[key].extend(values)
            else:
                previous_history[key] = values
        # Save the merged history
        with open(history_path, "w") as f:
            json.dump(previous_history, f)
    else:
        # Save new history if no previous history exists
        with open(history_path, "w") as f:
            json.dump(history_fine_tuned.history, f)
else:
    print("Training is already complete. No remaining epochs to train.")
Resuming training from epoch 7. Loading weights from: /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-checkpoints/balanced_model_epoch_07.weights.h5
Training is already complete. No remaining epochs to train.
InΒ [Β ]:
# Evaluate the model on the validation dataset
val_loss, val_accuracy = model.evaluate(validation_generator)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")

# Use the model for predictions
predictions = model.predict(validation_generator)
print(f"Predictions: {predictions}")
153/153 ━━━━━━━━━━━━━━━━━━━━ 80s 523ms/step - accuracy: 0.8102 - loss: 0.6262
Validation Loss: 0.593762993812561
Validation Accuracy: 0.8215384483337402
153/153 ━━━━━━━━━━━━━━━━━━━━ 78s 512ms/step
Predictions: [[1.23528643e-02 1.11263373e-03 1.59139128e-03 ... 9.13708031e-01
  2.94791767e-03 2.37477571e-02]
 [1.62648968e-02 6.79191944e-05 1.09474048e-01 ... 2.66785035e-04
  1.33584235e-02 2.47801002e-02]
 [3.66380155e-01 9.42197815e-03 1.76354218e-03 ... 5.07833660e-02
  2.12620548e-03 5.63247204e-02]
 ...
 [3.97009179e-02 4.64956183e-03 1.26892948e-04 ... 3.59809892e-05
  9.03877265e-07 9.52648461e-01]
 [7.04194307e-01 5.93279474e-05 8.15417524e-03 ... 2.69737153e-04
  2.81476132e-05 1.38558194e-01]
 [4.10646439e-01 3.55186639e-04 4.33757996e-05 ... 5.22846631e-06
  3.43866714e-06 5.32206178e-01]]
InΒ [Β ]:
# Get predicted labels
predicted_labels = np.argmax(predictions, axis=1)

# Call the function
accuracy_balanced_model, performance_df_balanced_model = model_performance_classification_keras(
    true_labels=true_labels,
    predicted_probs=predictions,
    class_labels=class_labels
)

# Print overall accuracy
print(f"Overall Accuracy: {accuracy_balanced_model:.2f}")

# Print class-wise performance metrics
print("Class-wise Performance Metrics:")
print(performance_df_balanced_model)
Overall Accuracy: 0.82
Class-wise Performance Metrics:
                        Class    Recall  Precision        F1
0                   apple_pie  0.657895   0.373134  0.476190
1              chocolate_cake  0.810000   0.855634  0.832192
2                      donuts  0.800000   0.845070  0.821918
3                     falafel  0.780000   0.787879  0.783920
4                french_fries  0.923333   0.926421  0.924875
5                     hot_dog  0.856667   0.871186  0.863866
6                   ice_cream  0.826087   0.817881  0.821963
7                      nachos  0.843333   0.671088  0.747415
8                 onion_rings  0.883333   0.943060  0.912220
9                    pancakes  0.760000   0.797203  0.778157
10                      pizza  0.893333   0.899329  0.896321
11                    ravioli  0.806667   0.867384  0.835924
12                     samosa  0.803333   0.803333  0.803333
13               spring_rolls  0.833333   0.850340  0.841751
14       strawberry_shortcake  0.860000   0.803738  0.830918
15                      tacos  0.660000   0.782609  0.716094
16                    waffles  0.760000   0.783505  0.771574
Overall               Overall  0.809254   0.804635  0.803449
InΒ [Β ]:
# Generate confusion matrix
conf_matrix = confusion_matrix(true_labels, predicted_labels)
# Plot confusion matrix
plt.figure(figsize=(12, 10))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_labels,
            yticklabels=class_labels)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()
[Output: confusion matrix heatmap]

ObservationsΒΆ

  • Applying balanced class weights improved the performance of the underrepresented class ("apple_pie") in certain metrics, particularly recall. However, this came at the cost of a slight decrease in precision. Overall, precision, recall, and F1 scores have shown improvement, but there is still room for enhancement through techniques like oversampling and other advanced strategies.

  • 53 'tacos' images were classified as 'nachos', and 17 'waffles' images as 'strawberry_shortcake'; the confusion matrix shows several other similar misclassifications.

  • The model now achieves a validation accuracy of 82% (0.82), indicating consistent performance across the dataset.

  • The "onion_rings" class continues to lead in terms of precision (94%), recall (88%), and F1 (91%) score. The second-best performance across these metrics is now observed in the "french_fries" class, followed by "pizza."

Insights and RecommendationsΒΆ

  • Further fine-tune the model with a specific focus on addressing the class imbalance issue related to the "apple_pie" category. Consider techniques such as oversampling the minority class and/or applying targeted data augmentation strategies specifically for this class to improve representation and performance.
  • The model currently achieves a validation accuracy of 82%, with an average recall of 80.92%, precision of 80.46%, and an F1 score of 80.34%. These metrics indicate a solid starting point and suggest that the model is reasonably reliable for classifying food item images.
  • Some images in the dataset appear to be invalid, as they do not visually correspond to their assigned class. During manual annotation, we observed an image labeled as "pancakes" that actually resembled a burger. These outliers negatively impacted model performance. A suitable strategy should be implemented to address such casesβ€”one approach could be to identify and remove these mislabeled images from the dataset.
  • Misclassifications such as "tacos" being predicted as "nachos" and "waffles" as "strawberry_shortcake" indicate areas where the model may be struggling to distinguish between visually similar classes. These cases should be analyzed further, and appropriate steps should be taken to improve class separation and enhance classification accuracy.
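As a quick diagnostic for the confusion noted above, the largest off-diagonal entries of the confusion matrix directly identify the most-confused class pairs. Below is a minimal sketch using a small hypothetical 3-class matrix; in the notebook, the existing `conf_matrix` and `class_labels` would be passed instead:

```python
import numpy as np

def top_confusions(conf_matrix, class_labels, k=3):
    """Return the k largest off-diagonal (true, predicted, count) triples."""
    cm = np.asarray(conf_matrix).copy()
    np.fill_diagonal(cm, 0)  # ignore correct predictions
    flat_idx = np.argsort(cm, axis=None)[::-1][:k]  # k largest remaining entries
    rows, cols = np.unravel_index(flat_idx, cm.shape)
    return [(class_labels[r], class_labels[c], int(cm[r, c]))
            for r, c in zip(rows, cols)]

# Illustrative 3-class example (hypothetical counts)
labels = ["nachos", "tacos", "waffles"]
cm = [[80, 5, 2],
      [53, 40, 1],
      [0, 3, 90]]
print(top_confusions(cm, labels, k=2))
# [('tacos', 'nachos', 53), ('nachos', 'tacos', 5)]
```

Sorting the pairs this way turns a visual inspection of the heatmap into a ranked list that can drive targeted augmentation for the worst-confused pairs.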

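One heuristic for surfacing mislabeled images (such as the "pancakes" image that resembled a burger) is to flag samples where the model assigns high probability to a class other than the assigned label. This is a sketch with hypothetical probabilities; in the notebook, the `predictions` array and `true_labels` from the validation generator would be used:

```python
import numpy as np

def flag_suspect_labels(pred_probs, true_labels, threshold=0.9):
    """Return indices where the model is confidently (>= threshold) predicting
    a class different from the assigned label - candidates for manual review."""
    pred_probs = np.asarray(pred_probs)
    true_labels = np.asarray(true_labels)
    pred = pred_probs.argmax(axis=1)
    conf = pred_probs.max(axis=1)
    return np.where((pred != true_labels) & (conf >= threshold))[0]

# Hypothetical probabilities for 4 images over 3 classes
probs = np.array([[0.95, 0.03, 0.02],   # confident class 0, labeled 1 -> suspect
                  [0.10, 0.85, 0.05],   # wrong but below threshold
                  [0.02, 0.03, 0.95],   # correct and confident
                  [0.40, 0.30, 0.30]])  # low confidence
labels = np.array([1, 0, 2, 0])
print(flag_suspect_labels(probs, labels))  # [0]
```

Flagged indices can then be inspected manually and removed or relabeled before retraining.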
Milestone 2ΒΆ

Fine-tune the trained basic CNN models to classify the foodΒΆ

Model Building: Oversample Minority ClassesΒΆ

Create new model with existing weights.ΒΆ

InΒ [Β ]:
import os
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.applications.efficientnet import preprocess_input
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.utils import resample

import json
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from IPython.display import clear_output
%matplotlib inline
from sklearn.metrics import classification_report, confusion_matrix
InΒ [Β ]:
#Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

#Logic to access a shared folder in Google Drive using shortcuts

#Set Project base path
project_path = '/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/'
#project_path = '/content/drive/MyDrive/shortcuts/Capstone/'
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
InΒ [Β ]:
# Define the path to the zip file and the extraction directory
zip_file_path = project_path + 'Food_101.zip'
food_dir = project_path + 'Food_101'

# Extract the zip file if the directory doesn't exist
if not os.path.exists(food_dir):
    print(f"Extracting '{zip_file_path}' to '{food_dir}'...")
    with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(project_path)
    print("Extraction complete.")
else:
    print(f"Directory '{food_dir}' already exists. Skipping extraction.")
Directory '/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/Food_101' already exists. Skipping extraction.
InΒ [Β ]:
# Parameters
IMAGE_SIZE = (224, 224)
BATCH_SIZE = 32
NUM_CLASSES = 17
DATA_DIR = food_dir  # Dataset folder with subfolders per class
InΒ [Β ]:
# Directory to save the new model weights
oversampled_checkpoint_dir = os.path.join(project_path, "EfficientNetB0-new-checkpoints")
checkpoint_dir = os.path.join(project_path, "EfficientNetB0-checkpoints")
os.makedirs(oversampled_checkpoint_dir, exist_ok=True)

# Path to save the new model weights
oversampled_checkpoint_path = os.path.join(oversampled_checkpoint_dir, "oversampled_balanced_weights_model_epoch_{epoch:02d}.weights.h5")
InΒ [Β ]:
import os
import shutil
from sklearn.utils import resample

def oversample_data(data_dir, target_classes, target_count):
    """
    Oversample the underrepresented classes to match the target count.

    Args:
        data_dir (str): Path to the dataset directory.
        target_classes (list): List of underrepresented class names.
        target_count (int): Target number of images for each class.

    Returns:
        None
    """
    for class_name in target_classes:
        class_dir = os.path.join(data_dir, class_name)
        if not os.path.exists(class_dir):
            print(f"Class directory {class_dir} not found.")
            continue

        # Get all image paths in the class directory
        image_paths = [os.path.join(class_dir, img) for img in os.listdir(class_dir) if img.lower().endswith(('.png', '.jpg', '.jpeg'))]
        current_count = len(image_paths)

        if current_count < target_count:
            # Oversample by duplicating images
            oversampled_images = resample(image_paths, replace=True, n_samples=target_count - current_count, random_state=42)
            for i, img_path in enumerate(oversampled_images):
                # Copy the image with a new name
                new_img_name = f"oversampled_{i}_{os.path.basename(img_path)}"
                new_img_path = os.path.join(class_dir, new_img_name)
                shutil.copy(img_path, new_img_path)  # Use shutil.copy to copy the file
            print(f"Oversampled {class_name} to {target_count} images.")
        else:
            print(f"{class_name} already has {current_count} images. No oversampling needed.")
InΒ [Β ]:
# Oversample the underrepresented classes
oversample_data(DATA_DIR, target_classes=["apple_pie", "ice_cream"], target_count=1000)
Oversampled apple_pie to 1000 images.
Oversampled ice_cream to 1000 images.
InΒ [Β ]:
# Data augmentation and preprocessing
new_train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    horizontal_flip=True,
    validation_split=0.3
)

# Create a new training data generator
new_train_generator = new_train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='training',
    shuffle=True
)

# Create a new validation data generator
new_validation_generator = new_train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='validation',
    shuffle=False
)
Found 11900 images belonging to 17 classes.
Found 5100 images belonging to 17 classes.
InΒ [Β ]:
# Load EfficientNetB0 base model without top layer
new_base_model = EfficientNetB0(include_top=False, input_shape=IMAGE_SIZE + (3,), weights='imagenet')
new_base_model.trainable = False

# Build the new classification head
inputs = layers.Input(shape=IMAGE_SIZE + (3,))
x = new_base_model(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(NUM_CLASSES, activation='softmax')(x)
oversampled_balanced_weights_model = models.Model(inputs, outputs)

# Compile the new model
oversampled_balanced_weights_model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Load weights from the previous model
previous_weights_path = os.path.join(checkpoint_dir, "balanced_model_epoch_07.weights.h5")
if os.path.exists(previous_weights_path):
    print(f"Loading weights from: {previous_weights_path}")
    oversampled_balanced_weights_model.load_weights(previous_weights_path)
    print('Weights Loaded!')
else:
    raise FileNotFoundError(f"Weights file not found at: {previous_weights_path}")

# Callback to save new model weights at every epoch
oversampled_checkpoint_callback = ModelCheckpoint(
    filepath=oversampled_checkpoint_path,
    save_weights_only=True,
    save_best_only=False,
    verbose=1
)
Loading weights from: /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-checkpoints/balanced_model_epoch_07.weights.h5
Weights Loaded!
/usr/local/lib/python3.11/dist-packages/keras/src/saving/saving_lib.py:757: UserWarning: Skipping variable loading for optimizer 'adam', because it has 2 variables whereas the saved optimizer has 6 variables. 
  saveable.load_own_variables(weights_store.get(inner_path))

Train model with Oversampled minority classesΒΆ

InΒ [Β ]:
# Train the new model
total_epochs = 10
oversampled_balanced_weights_model_history = oversampled_balanced_weights_model.fit(
    new_train_generator,
    validation_data=new_validation_generator,
    epochs=total_epochs,
    callbacks=[oversampled_checkpoint_callback],
    verbose=1
)
/usr/local/lib/python3.11/dist-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 7s/step - accuracy: 0.8220 - loss: 0.5640
Epoch 1: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_01.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 4145s 11s/step - accuracy: 0.8220 - loss: 0.5640 - val_accuracy: 0.8245 - val_loss: 0.5834
Epoch 2/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 515ms/step - accuracy: 0.8220 - loss: 0.5475
Epoch 2: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_02.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 273s 735ms/step - accuracy: 0.8220 - loss: 0.5475 - val_accuracy: 0.8237 - val_loss: 0.5810
Epoch 3/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 517ms/step - accuracy: 0.8389 - loss: 0.5165
Epoch 3: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_03.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 275s 740ms/step - accuracy: 0.8389 - loss: 0.5165 - val_accuracy: 0.8202 - val_loss: 0.5852
Epoch 4/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 516ms/step - accuracy: 0.8346 - loss: 0.5215
Epoch 4: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_04.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 274s 737ms/step - accuracy: 0.8346 - loss: 0.5215 - val_accuracy: 0.8239 - val_loss: 0.5817
Epoch 5/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 513ms/step - accuracy: 0.8343 - loss: 0.5108
Epoch 5: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_05.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 271s 729ms/step - accuracy: 0.8343 - loss: 0.5108 - val_accuracy: 0.8188 - val_loss: 0.5841
Epoch 6/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 513ms/step - accuracy: 0.8360 - loss: 0.5017
Epoch 6: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_06.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 272s 732ms/step - accuracy: 0.8360 - loss: 0.5018 - val_accuracy: 0.8235 - val_loss: 0.5864
Epoch 7/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 512ms/step - accuracy: 0.8409 - loss: 0.5044
Epoch 7: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_07.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 271s 728ms/step - accuracy: 0.8409 - loss: 0.5045 - val_accuracy: 0.8176 - val_loss: 0.5947
Epoch 8/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 509ms/step - accuracy: 0.8341 - loss: 0.5057
Epoch 8: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_08.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 271s 730ms/step - accuracy: 0.8341 - loss: 0.5057 - val_accuracy: 0.8216 - val_loss: 0.5819
Epoch 9/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 508ms/step - accuracy: 0.8356 - loss: 0.5008
Epoch 9: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_09.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 268s 722ms/step - accuracy: 0.8356 - loss: 0.5009 - val_accuracy: 0.8284 - val_loss: 0.5756
Epoch 10/10
372/372 ━━━━━━━━━━━━━━━━━━━━ 0s 508ms/step - accuracy: 0.8311 - loss: 0.5194
Epoch 10: saving model to /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_model_epoch_10.weights.h5
372/372 ━━━━━━━━━━━━━━━━━━━━ 269s 723ms/step - accuracy: 0.8312 - loss: 0.5194 - val_accuracy: 0.8237 - val_loss: 0.5738
InΒ [Β ]:
# Save the new training history
oversampled_history_path = os.path.join(oversampled_checkpoint_dir, "oversampled_balanced_weights_training_history.json")
with open(oversampled_history_path, "w") as f:
    json.dump(oversampled_balanced_weights_model_history.history, f)

print(f"New training history saved to: {oversampled_history_path}")
New training history saved to: /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/EfficientNetB0-new-checkpoints/oversampled_balanced_weights_training_history.json

Evaluate ModelΒΆ

InΒ [Β ]:
# Get true labels from the validation generator
true_labels = new_validation_generator.classes
class_labels = list(new_validation_generator.class_indices.keys())  # Class names
InΒ [Β ]:
# Evaluate the model on the validation dataset
val_loss, val_accuracy = oversampled_balanced_weights_model.evaluate(new_validation_generator)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")

# Use the oversampled_balanced_weights_model for predictions
predictions = oversampled_balanced_weights_model.predict(new_validation_generator)
print(f"Predictions: {predictions}")
160/160 ━━━━━━━━━━━━━━━━━━━━ 83s 520ms/step - accuracy: 0.8367 - loss: 0.5370
Validation Loss: 0.5706741809844971
Validation Accuracy: 0.8264706134796143
160/160 ━━━━━━━━━━━━━━━━━━━━ 92s 533ms/step
Predictions: [[4.76259321e-01 6.97963638e-04 1.73797947e-03 ... 4.70280886e-01
  1.58416363e-03 5.48161194e-03]
 [5.51208377e-01 8.16329612e-06 6.30339310e-02 ... 2.00479426e-05
  1.91765328e-04 1.75134069e-03]
 [7.05581248e-01 1.46976404e-03 6.10832009e-04 ... 2.38315258e-02
  4.25823359e-03 4.15265560e-02]
 ...
 [2.79928632e-02 8.85467278e-04 7.39451498e-05 ... 4.07327889e-06
  1.05954506e-07 9.70229268e-01]
 [9.57819819e-01 1.42157169e-05 1.49355648e-04 ... 3.06297588e-05
  1.19527067e-05 1.23791378e-02]
 [1.09339900e-01 1.49730877e-05 2.38824250e-05 ... 5.23641029e-06
  2.67329706e-05 3.07284534e-01]]
InΒ [Β ]:
# Get predicted labels
predicted_labels = np.argmax(predictions, axis=1)

# Call the function
accuracy_oversampled_balanced_weights_model, performance_df_oversampled_balanced_weights_model = model_performance_classification_keras(
    true_labels=true_labels,
    predicted_probs=predictions,
    class_labels=class_labels
)
InΒ [Β ]:
# Print overall accuracy
print(f"Overall Accuracy: {accuracy_oversampled_balanced_weights_model:.2f}")

# Print class-wise performance metrics
print("Class-wise Performance Metrics:")
print(performance_df_oversampled_balanced_weights_model)
Overall Accuracy: 0.82
Class-wise Performance Metrics:
                        Class    Recall  Precision        F1
0                   apple_pie  0.900000   0.727763  0.804769
1              chocolate_cake  0.863333   0.809375  0.835484
2                      donuts  0.830000   0.819079  0.824503
3                     falafel  0.756667   0.793706  0.774744
4                french_fries  0.893333   0.950355  0.920962
5                     hot_dog  0.846667   0.838284  0.842454
6                   ice_cream  0.790000   0.834507  0.811644
7                      nachos  0.770000   0.764901  0.767442
8                 onion_rings  0.903333   0.924915  0.913997
9                    pancakes  0.740000   0.819188  0.777583
10                      pizza  0.893333   0.920962  0.906937
11                    ravioli  0.850000   0.777439  0.812102
12                     samosa  0.786667   0.890566  0.835398
13               spring_rolls  0.856667   0.831715  0.844007
14       strawberry_shortcake  0.853333   0.820513  0.836601
15                      tacos  0.763333   0.733974  0.748366
16                    waffles  0.723333   0.812734  0.765432
Overall               Overall  0.824706   0.827646  0.824849

Confusion MatrixΒΆ

InΒ [Β ]:
# Generate confusion matrix
oversampled_balanced_weights_model_conf_matrix = confusion_matrix(true_labels, predicted_labels)
# Plot confusion matrix
plt.figure(figsize=(12, 10))
sns.heatmap(oversampled_balanced_weights_model_conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_labels,
            yticklabels=class_labels)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()
[Output: confusion matrix heatmap]

ObservationsΒΆ

  • The model's performance for classes with fewer images has significantly improved.

  • For the apple_pie class, performance improved notably:

    • Recall increased from 0.6579 to 0.9000
    • Precision increased from 0.3731 to 0.7278
    • F1 Score improved from 0.4762 to 0.8048
  • For the ice_cream class, Precision improved slightly, while Recall and F1 Score decreased marginally:

    • Recall: 0.8261 β†’ 0.7900
    • Precision: 0.8179 β†’ 0.8345
    • F1 Score: 0.8220 β†’ 0.8116

    This is expected, as the ice_cream class originally had 993 images and was oversampled to 1000, so only 7 duplicate images were added.
  • The confusion matrix clearly shows the improvement.

  • Overall, the model's validation performance has improved slightly:

    • Validation Accuracy: 82.15% β†’ 82.65%
    • Average Recall: 80.92% β†’ 82.47%
    • Average Precision: 80.46% β†’ 82.76%
    • F1 Score: 80.34% β†’ 82.48%

    These metrics indicate that the previous model's weakness in identifying underrepresented classes has been effectively addressed.

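The before/after comparison above can be produced programmatically by merging the two per-class metric DataFrames on `Class`. This sketch uses two abbreviated example frames mirroring the tables above; in the notebook, `performance_df_balanced_model` and `performance_df_oversampled_balanced_weights_model` would be merged instead:

```python
import pandas as pd

# Abbreviated rows taken from the two result tables above
before = pd.DataFrame({"Class": ["apple_pie", "ice_cream"],
                       "F1": [0.476190, 0.821963]})
after = pd.DataFrame({"Class": ["apple_pie", "ice_cream"],
                      "F1": [0.804769, 0.811644]})

# Side-by-side comparison with a per-class delta column
comparison = before.merge(after, on="Class", suffixes=("_balanced", "_oversampled"))
comparison["F1_delta"] = comparison["F1_oversampled"] - comparison["F1_balanced"]
print(comparison)
```

Sorting this frame by `F1_delta` makes it easy to see which classes benefited most from oversampling and which regressed.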
Food Detection Model CreationΒΆ

InΒ [Β ]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive

Import Required LibrariesΒΆ

  • Rationale: At the time of building this model, Google Colab supports only TensorFlow 2.8 and above. We use the Detectron2 library to create R-CNN models and their variants, since it runs in the current Colab environment, unlike the Matterport Mask R-CNN library (and its forks), which depend on TensorFlow versions below 2.8.
  • Install Detectron Library
InΒ [Β ]:
import os
import numpy as np
import json
import cv2
import random
InΒ [Β ]:
!pip install 'git+https://github.com/facebookresearch/detectron2.git'
Collecting git+https://github.com/facebookresearch/detectron2.git
  Cloning https://github.com/facebookresearch/detectron2.git to /tmp/pip-req-build-_kzwy9mp
  Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/detectron2.git /tmp/pip-req-build-_kzwy9mp
  Resolved https://github.com/facebookresearch/detectron2.git to commit 65184fc057d4fab080a98564f6b60fae0b94edc4
  Preparing metadata (setup.py) ... done
Requirement already satisfied: Pillow>=7.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (11.2.1)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (3.10.0)
Requirement already satisfied: pycocotools>=2.0.2 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.0.9)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (3.1.0)
Collecting yacs>=0.1.8 (from detectron2==0.6)
  Downloading yacs-0.1.8-py3-none-any.whl.metadata (639 bytes)
Requirement already satisfied: tabulate in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (0.9.0)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (3.1.1)
Requirement already satisfied: tqdm>4.29.0 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (4.67.1)
Requirement already satisfied: tensorboard in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.18.0)
Collecting fvcore<0.1.6,>=0.1.5 (from detectron2==0.6)
  Downloading fvcore-0.1.5.post20221221.tar.gz (50 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.2/50.2 kB 2.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting iopath<0.1.10,>=0.1.7 (from detectron2==0.6)
  Downloading iopath-0.1.9-py3-none-any.whl.metadata (370 bytes)
Requirement already satisfied: omegaconf<2.4,>=2.1 in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (2.3.0)
Collecting hydra-core>=1.1 (from detectron2==0.6)
  Downloading hydra_core-1.3.2-py3-none-any.whl.metadata (5.5 kB)
Collecting black (from detectron2==0.6)
  Downloading black-25.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl.metadata (81 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.3/81.3 kB 4.1 MB/s eta 0:00:00
Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from detectron2==0.6) (24.2)
Requirement already satisfied: numpy in /usr/local/lib/python3.11/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6) (2.0.2)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.11/dist-packages (from fvcore<0.1.6,>=0.1.5->detectron2==0.6) (6.0.2)
Requirement already satisfied: antlr4-python3-runtime==4.9.* in /usr/local/lib/python3.11/dist-packages (from hydra-core>=1.1->detectron2==0.6) (4.9.3)
Collecting portalocker (from iopath<0.1.10,>=0.1.7->detectron2==0.6)
  Downloading portalocker-3.1.1-py3-none-any.whl.metadata (8.6 kB)
Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (8.2.1)
Collecting mypy-extensions>=0.4.3 (from black->detectron2==0.6)
  Downloading mypy_extensions-1.1.0-py3-none-any.whl.metadata (1.1 kB)
Collecting pathspec>=0.9.0 (from black->detectron2==0.6)
  Downloading pathspec-0.12.1-py3-none-any.whl.metadata (21 kB)
Requirement already satisfied: platformdirs>=2 in /usr/local/lib/python3.11/dist-packages (from black->detectron2==0.6) (4.3.8)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (4.58.1)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (1.4.8)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib->detectron2==0.6) (2.9.0.post0)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.4.0)
Requirement already satisfied: grpcio>=1.48.2 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.72.1)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (3.8)
Requirement already satisfied: protobuf!=4.24.0,>=3.19.6 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (5.29.5)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (75.2.0)
Requirement already satisfied: six>1.9 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (1.17.0)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from tensorboard->detectron2==0.6) (3.1.3)
Requirement already satisfied: MarkupSafe>=2.1.1 in /usr/local/lib/python3.11/dist-packages (from werkzeug>=1.0.1->tensorboard->detectron2==0.6) (3.0.2)
Downloading hydra_core-1.3.2-py3-none-any.whl (154 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.5/154.5 kB 7.7 MB/s eta 0:00:00
Downloading iopath-0.1.9-py3-none-any.whl (27 kB)
Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Downloading black-25.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl (1.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 37.8 MB/s eta 0:00:00
Downloading mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB)
Downloading pathspec-0.12.1-py3-none-any.whl (31 kB)
Downloading portalocker-3.1.1-py3-none-any.whl (19 kB)
Building wheels for collected packages: detectron2, fvcore
  Building wheel for detectron2 (setup.py) ... done
  Created wheel for detectron2: filename=detectron2-0.6-cp311-cp311-linux_x86_64.whl size=6438669 sha256=18f9785afaa7b72b517fea7a32a31d6a4aae4c8a8a78c0b8abf0112cae398ffc
  Stored in directory: /tmp/pip-ephem-wheel-cache-4gykv41o/wheels/17/d9/40/60db98e485aa9455d653e29d1046601ce96fe23647f60c1c5a
  Building wheel for fvcore (setup.py) ... done
  Created wheel for fvcore: filename=fvcore-0.1.5.post20221221-py3-none-any.whl size=61397 sha256=a6e829af11b6946536ea3bcd80d370854534bd5c9c505d5186d4eddd8b020f3f
  Stored in directory: /root/.cache/pip/wheels/65/71/95/3b8fde5c65c6e4a806e0867c1651dcc71a1cb2f3430e8f355f
Successfully built detectron2 fvcore
Installing collected packages: yacs, portalocker, pathspec, mypy-extensions, iopath, hydra-core, black, fvcore, detectron2
Successfully installed black-25.1.0 detectron2-0.6 fvcore-0.1.5.post20221221 hydra-core-1.3.2 iopath-0.1.9 mypy-extensions-1.1.0 pathspec-0.12.1 portalocker-3.1.1 yacs-0.1.8

Faster R-CNN Model¶

In [ ]:
# Food detection model using Faster R-CNN (Detectron2)
import os
import random
import pickle

import cv2
import matplotlib.pyplot as plt

import detectron2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer, DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.utils.visualizer import Visualizer
In [ ]:
# Paths to the dataset
#captsone_project_path = "/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/"
captsone_project_path = "/content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/"
#captsone_project_path = "/content/drive/MyDrive/shortcuts/Capstone/"
train_images_path = captsone_project_path + "fooddetection-cap-cv3-may24b.v7i.coco/train"
val_images_path = captsone_project_path + "fooddetection-cap-cv3-may24b.v7i.coco/test"
train_annotations_path = train_images_path + "/_annotations.coco.json"
val_annotations_path = val_images_path + "/_annotations.coco.json"
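Before registering the datasets, it can help to confirm that the annotation files exist and parse as COCO JSON. The helper below is an illustrative sketch (`summarize_coco` is our own name, not part of Detectron2); it reports the image count, annotation count, and category names from a COCO annotations file.

```python
import json
import os

def summarize_coco(annotations_path):
    """Return (num_images, num_annotations, category names) from a COCO JSON file."""
    if not os.path.isfile(annotations_path):
        raise FileNotFoundError(annotations_path)
    with open(annotations_path) as f:
        coco = json.load(f)
    names = [c["name"] for c in coco.get("categories", [])]
    return len(coco.get("images", [])), len(coco.get("annotations", [])), names

# Example usage with the paths configured above:
# print(summarize_coco(train_annotations_path))
```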
In [ ]:
# Register the dataset in Detectron2
register_coco_instances("food_train", {}, train_annotations_path, train_images_path)
register_coco_instances("food_val", {}, val_annotations_path, val_images_path)

# Load metadata
food_metadata = MetadataCatalog.get("food_train")

# Visualize a few samples from the dataset
def visualize_samples(dataset_name, metadata, num_samples=5):
    dataset_dicts = DatasetCatalog.get(dataset_name)
    random.shuffle(dataset_dicts)  # Shuffle the dataset to show random images
    for d in dataset_dicts[:num_samples]:
        img = cv2.cvtColor(cv2.imread(d["file_name"]), cv2.COLOR_BGR2RGB)  # Convert BGR to RGB
        visualizer = Visualizer(img, metadata=metadata, scale=0.5)
        vis = visualizer.draw_dataset_dict(d)

        plt.figure(figsize=(4, 4))
        plt.axis("off")
        plt.imshow(vis.get_image())
        plt.show()


visualize_samples("food_train", food_metadata)
WARNING:detectron2.data.datasets.coco:
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[Output: five randomly chosen training images rendered with their ground-truth bounding boxes]
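Detectron2 prints a per-class instance table during training, but the same distribution can be computed directly from a COCO-style annotation dict. The helper below is an illustrative sketch (`instance_distribution` is our own name) that maps each annotation's `category_id` back to its category name and counts instances.

```python
from collections import Counter

def instance_distribution(coco_dict):
    """Count annotation instances per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in coco_dict["annotations"])
    return dict(counts)

# Example usage on a loaded annotations JSON:
# import json
# with open(train_annotations_path) as f:
#     print(instance_distribution(json.load(f)))
```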
In [ ]:
# Configure the Faster R-CNN model for object detection
cfg_fast_rcnn = get_cfg()
# Load the Faster R-CNN config from Detectron2's model zoo
cfg_fast_rcnn.merge_from_file(
    model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
)  # Faster R-CNN with a ResNet-50 FPN backbone
cfg_fast_rcnn.DATASETS.TRAIN = ("food_train",)
cfg_fast_rcnn.DATASETS.TEST = ("food_val",)
cfg_fast_rcnn.DATALOADER.NUM_WORKERS = 4
cfg_fast_rcnn.MODEL.WEIGHTS = detectron2.model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
)  # Pretrained weights
cfg_fast_rcnn.SOLVER.IMS_PER_BATCH = 2
cfg_fast_rcnn.SOLVER.BASE_LR = 0.00025  # Learning rate
cfg_fast_rcnn.SOLVER.MAX_ITER = 3000  # Adjust based on dataset size
cfg_fast_rcnn.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg_fast_rcnn.MODEL.ROI_HEADS.NUM_CLASSES = len(food_metadata.thing_classes)  # Number of classes
cfg_fast_rcnn.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Threshold for predictions
# Disable mask prediction (only bounding box detection)
cfg_fast_rcnn.MODEL.MASK_ON = False
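With `SOLVER.IMS_PER_BATCH = 2` and `SOLVER.MAX_ITER = 3000`, the number of passes over the training set can be estimated as iterations × batch size ÷ number of images. The helper below is a small illustrative calculation (`approx_epochs` is our own name; the 419-image count comes from the training log further down).

```python
def approx_epochs(max_iter, ims_per_batch, num_train_images):
    """Approximate epochs seen during training: images processed / dataset size."""
    return max_iter * ims_per_batch / num_train_images

# With the configuration above and the 419 usable training images,
# the model sees roughly 14.3 passes over the data:
# approx_epochs(3000, 2, 419)
```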

Augmentation - Detectron2 applies some augmentations by default:

  • ResizeShortestEdge resizes each image so its shortest side falls within a sampled range (e.g., 640-800 pixels), capped at a maximum size of 1333

  • RandomFlip (horizontal flip with 50% probability)

  • Rotation and shear are not applied, since they would distort the axis-aligned bounding boxes
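The shortest-edge resizing rule can be sketched in plain Python: scale so the shortest side reaches the target, then shrink further if the longest side would exceed the cap. This standalone helper mirrors the behavior of `ResizeShortestEdge` for illustration; it is not the Detectron2 implementation.

```python
def resize_shortest_edge(h, w, short_edge=800, max_size=1333):
    """Compute the output (height, width) after shortest-edge resizing."""
    scale = short_edge / min(h, w)
    if max(h, w) * scale > max_size:  # cap the longest side
        scale = max_size / max(h, w)
    return round(h * scale), round(w * scale)

# A 480x640 image scaled so the short side is 800:
# resize_shortest_edge(480, 640) -> (800, 1067)
```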

Training the Model¶
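The `lr` column in the training log that follows ramps up linearly at first: Detectron2 warms the learning rate from `BASE_LR * WARMUP_FACTOR` to `BASE_LR` over the first `WARMUP_ITERS` iterations (defaults 0.001 and 1000, which we assume here since the config above does not override them). The sketch below reproduces that schedule.

```python
def warmup_lr(iteration, base_lr=0.00025, warmup_iters=1000, warmup_factor=0.001):
    """Linear warmup: interpolate from base_lr * warmup_factor to base_lr."""
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters
    factor = warmup_factor * (1 - alpha) + alpha
    return base_lr * factor

# warmup_lr(19) -> ~4.995e-06, matching the "lr: 4.9953e-06" logged at iteration 19
```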

In [ ]:
# Output directory
cfg_fast_rcnn.OUTPUT_DIR = captsone_project_path + "output_food_detection_faster_rcnn"
os.makedirs(cfg_fast_rcnn.OUTPUT_DIR, exist_ok=True)

# Train the model
trainer = DefaultTrainer(cfg_fast_rcnn)
trainer.resume_or_load(resume=False)
trainer.train()
[06/11 20:33:26 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
      (res2): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv1): Conv2d(
            64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
      )
      (res3): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv1): Conv2d(
            256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
      )
      (res4): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv1): Conv2d(
            512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (4): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (5): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
      )
      (res5): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv1): Conv2d(
            1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
      )
    )
  )
  (proposal_generator): RPN(
    (rpn_head): StandardRPNHead(
      (conv): Conv2d(
        256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)
        (activation): ReLU()
      )
      (objectness_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))
      (anchor_deltas): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1))
    )
    (anchor_generator): DefaultAnchorGenerator(
      (cell_anchors): BufferList()
    )
  )
  (roi_heads): StandardROIHeads(
    (box_pooler): ROIPooler(
      (level_poolers): ModuleList(
        (0): ROIAlign(output_size=(7, 7), spatial_scale=0.25, sampling_ratio=0, aligned=True)
        (1): ROIAlign(output_size=(7, 7), spatial_scale=0.125, sampling_ratio=0, aligned=True)
        (2): ROIAlign(output_size=(7, 7), spatial_scale=0.0625, sampling_ratio=0, aligned=True)
        (3): ROIAlign(output_size=(7, 7), spatial_scale=0.03125, sampling_ratio=0, aligned=True)
      )
    )
    (box_head): FastRCNNConvFCHead(
      (flatten): Flatten(start_dim=1, end_dim=-1)
      (fc1): Linear(in_features=12544, out_features=1024, bias=True)
      (fc_relu1): ReLU()
      (fc2): Linear(in_features=1024, out_features=1024, bias=True)
      (fc_relu2): ReLU()
    )
    (box_predictor): FastRCNNOutputLayers(
      (cls_score): Linear(in_features=1024, out_features=14, bias=True)
      (bbox_pred): Linear(in_features=1024, out_features=52, bias=True)
    )
  )
)
WARNING [06/11 20:33:26 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 20:33:26 d2.data.datasets.coco]: Loaded 420 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/train/_annotations.coco.json
[06/11 20:33:26 d2.data.build]: Removed 1 images with no usable annotations. 419 images left.
[06/11 20:33:26 d2.data.build]: Distribution of instances among all 13 categories:
|   category    | #instances   |  category   | #instances   |   category    | #instances   |
|:-------------:|:-------------|:-----------:|:-------------|:-------------:|:-------------|
| food-detect.. | 0            |  apple_pie  | 35           | chocolate_c.. | 37           |
| french_fries  | 35           |   hot_dog   | 39           |   ice_cream   | 41           |
|    nachos     | 35           | onion_rings | 45           |   pancakes    | 35           |
|     pizza     | 41           |   ravioli   | 35           |    samosa     | 34           |
| spring_rolls  | 35           |             |              |               |              |
|     total     | 447          |             |              |               |              |
[06/11 20:33:26 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[06/11 20:33:26 d2.data.build]: Using training sampler TrainingSampler
[06/11 20:33:26 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 20:33:26 d2.data.common]: Serializing 419 elements to byte tensors and concatenating them all ...
[06/11 20:33:26 d2.data.common]: Serialized dataset takes 0.14 MiB
[06/11 20:33:26 d2.data.build]: Making batched data loader with batch_size=2
WARNING [06/11 20:33:26 d2.solver.build]: SOLVER.STEPS contains values larger than SOLVER.MAX_ITER. These values will be ignored.
[06/11 20:33:26 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl ...
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
model_final_280758.pkl: 167MB [00:01, 105MB/s]                           
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (14, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (14,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (52, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (52,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
[06/11 20:33:28 d2.engine.train_loop]: Starting training from iteration 0
/usr/local/lib/python3.11/dist-packages/torch/functional.py:539: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:3637.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[06/11 20:33:38 d2.utils.events]:  eta: 0:14:01  iter: 19  total_loss: 3.025  loss_cls: 2.494  loss_box_reg: 0.4933  loss_rpn_cls: 0.004579  loss_rpn_loc: 0.006831    time: 0.2872  last_time: 0.3118  data_time: 0.0391  last_data_time: 0.0052   lr: 4.9953e-06  max_mem: 1740M
[06/11 20:33:48 d2.utils.events]:  eta: 0:13:48  iter: 39  total_loss: 2.85  loss_cls: 2.353  loss_box_reg: 0.459  loss_rpn_cls: 0.007624  loss_rpn_loc: 0.006427    time: 0.2869  last_time: 0.2190  data_time: 0.0078  last_data_time: 0.0041   lr: 9.9902e-06  max_mem: 1740M
[06/11 20:33:54 d2.utils.events]:  eta: 0:13:48  iter: 59  total_loss: 2.481  loss_cls: 2.041  loss_box_reg: 0.4611  loss_rpn_cls: 0.01153  loss_rpn_loc: 0.005852    time: 0.2861  last_time: 0.2710  data_time: 0.0076  last_data_time: 0.0053   lr: 1.4985e-05  max_mem: 1740M
[06/11 20:34:00 d2.utils.events]:  eta: 0:13:51  iter: 79  total_loss: 2.069  loss_cls: 1.547  loss_box_reg: 0.4625  loss_rpn_cls: 0.003853  loss_rpn_loc: 0.007155    time: 0.2920  last_time: 0.3268  data_time: 0.0074  last_data_time: 0.0052   lr: 1.998e-05  max_mem: 1741M
[06/11 20:34:06 d2.utils.events]:  eta: 0:13:49  iter: 99  total_loss: 1.607  loss_cls: 1.123  loss_box_reg: 0.4994  loss_rpn_cls: 0.006247  loss_rpn_loc: 0.006993    time: 0.2898  last_time: 0.2378  data_time: 0.0050  last_data_time: 0.0039   lr: 2.4975e-05  max_mem: 1741M
[06/11 20:34:12 d2.utils.events]:  eta: 0:13:50  iter: 119  total_loss: 1.27  loss_cls: 0.7018  loss_box_reg: 0.5265  loss_rpn_cls: 0.004479  loss_rpn_loc: 0.006746    time: 0.2931  last_time: 0.2775  data_time: 0.0082  last_data_time: 0.0050   lr: 2.997e-05  max_mem: 1741M
[06/11 20:34:18 d2.utils.events]:  eta: 0:13:43  iter: 139  total_loss: 0.9585  loss_cls: 0.5244  loss_box_reg: 0.4049  loss_rpn_cls: 0.003977  loss_rpn_loc: 0.006713    time: 0.2921  last_time: 0.2858  data_time: 0.0062  last_data_time: 0.0055   lr: 3.4965e-05  max_mem: 1741M
[06/11 20:34:24 d2.utils.events]:  eta: 0:13:39  iter: 159  total_loss: 0.9319  loss_cls: 0.4998  loss_box_reg: 0.434  loss_rpn_cls: 0.002519  loss_rpn_loc: 0.006733    time: 0.2946  last_time: 0.2666  data_time: 0.0062  last_data_time: 0.0056   lr: 3.996e-05  max_mem: 1741M
[06/11 20:34:30 d2.utils.events]:  eta: 0:13:35  iter: 179  total_loss: 0.9516  loss_cls: 0.4927  loss_box_reg: 0.4497  loss_rpn_cls: 0.001721  loss_rpn_loc: 0.005895    time: 0.2944  last_time: 0.2948  data_time: 0.0052  last_data_time: 0.0048   lr: 4.4955e-05  max_mem: 1741M
[06/11 20:34:36 d2.utils.events]:  eta: 0:13:30  iter: 199  total_loss: 0.8713  loss_cls: 0.4329  loss_box_reg: 0.4246  loss_rpn_cls: 0.002531  loss_rpn_loc: 0.005808    time: 0.2949  last_time: 0.3113  data_time: 0.0053  last_data_time: 0.0065   lr: 4.995e-05  max_mem: 1741M
[06/11 20:34:42 d2.utils.events]:  eta: 0:13:28  iter: 219  total_loss: 1.099  loss_cls: 0.553  loss_box_reg: 0.5354  loss_rpn_cls: 0.002099  loss_rpn_loc: 0.005386    time: 0.2956  last_time: 0.3325  data_time: 0.0083  last_data_time: 0.0052   lr: 5.4945e-05  max_mem: 1741M
[06/11 20:34:48 d2.utils.events]:  eta: 0:13:23  iter: 239  total_loss: 0.9879  loss_cls: 0.4953  loss_box_reg: 0.4726  loss_rpn_cls: 0.002988  loss_rpn_loc: 0.005118    time: 0.2961  last_time: 0.3346  data_time: 0.0053  last_data_time: 0.0045   lr: 5.994e-05  max_mem: 1741M
[06/11 20:34:54 d2.utils.events]:  eta: 0:13:19  iter: 259  total_loss: 1.106  loss_cls: 0.5185  loss_box_reg: 0.5503  loss_rpn_cls: 0.002747  loss_rpn_loc: 0.005175    time: 0.2975  last_time: 0.3389  data_time: 0.0078  last_data_time: 0.0049   lr: 6.4935e-05  max_mem: 1741M
[06/11 20:35:00 d2.utils.events]:  eta: 0:13:15  iter: 279  total_loss: 0.8869  loss_cls: 0.4419  loss_box_reg: 0.4401  loss_rpn_cls: 0.00371  loss_rpn_loc: 0.005696    time: 0.2976  last_time: 0.3048  data_time: 0.0048  last_data_time: 0.0046   lr: 6.993e-05  max_mem: 1741M
[06/11 20:35:07 d2.utils.events]:  eta: 0:13:19  iter: 299  total_loss: 0.7982  loss_cls: 0.3784  loss_box_reg: 0.3991  loss_rpn_cls: 0.001867  loss_rpn_loc: 0.00571    time: 0.2997  last_time: 0.2752  data_time: 0.0108  last_data_time: 0.0096   lr: 7.4925e-05  max_mem: 1741M
[06/11 20:35:13 d2.utils.events]:  eta: 0:13:13  iter: 319  total_loss: 0.8444  loss_cls: 0.415  loss_box_reg: 0.4  loss_rpn_cls: 0.000636  loss_rpn_loc: 0.005216    time: 0.3000  last_time: 0.2739  data_time: 0.0050  last_data_time: 0.0065   lr: 7.992e-05  max_mem: 1741M
[06/11 20:35:20 d2.utils.events]:  eta: 0:13:09  iter: 339  total_loss: 0.9434  loss_cls: 0.4486  loss_box_reg: 0.487  loss_rpn_cls: 0.0009363  loss_rpn_loc: 0.004064    time: 0.3017  last_time: 0.2800  data_time: 0.0143  last_data_time: 0.0053   lr: 8.4915e-05  max_mem: 1741M
[06/11 20:35:26 d2.utils.events]:  eta: 0:13:06  iter: 359  total_loss: 0.7534  loss_cls: 0.3729  loss_box_reg: 0.3814  loss_rpn_cls: 0.002124  loss_rpn_loc: 0.004212    time: 0.3024  last_time: 0.3195  data_time: 0.0048  last_data_time: 0.0045   lr: 8.991e-05  max_mem: 1741M
[06/11 20:35:32 d2.utils.events]:  eta: 0:13:04  iter: 379  total_loss: 0.7075  loss_cls: 0.3334  loss_box_reg: 0.3776  loss_rpn_cls: 0.001112  loss_rpn_loc: 0.005579    time: 0.3033  last_time: 0.3533  data_time: 0.0067  last_data_time: 0.0254   lr: 9.4905e-05  max_mem: 1741M
[06/11 20:35:39 d2.utils.events]:  eta: 0:13:00  iter: 399  total_loss: 0.9504  loss_cls: 0.422  loss_box_reg: 0.4985  loss_rpn_cls: 0.001048  loss_rpn_loc: 0.005298    time: 0.3040  last_time: 0.2876  data_time: 0.0051  last_data_time: 0.0048   lr: 9.99e-05  max_mem: 1741M
[06/11 20:35:45 d2.utils.events]:  eta: 0:12:57  iter: 419  total_loss: 0.8156  loss_cls: 0.3696  loss_box_reg: 0.4129  loss_rpn_cls: 0.001428  loss_rpn_loc: 0.00598    time: 0.3052  last_time: 0.3706  data_time: 0.0088  last_data_time: 0.0295   lr: 0.0001049  max_mem: 1741M
[06/11 20:35:52 d2.utils.events]:  eta: 0:12:55  iter: 439  total_loss: 0.8527  loss_cls: 0.3963  loss_box_reg: 0.4519  loss_rpn_cls: 0.0009681  loss_rpn_loc: 0.004308    time: 0.3057  last_time: 0.3445  data_time: 0.0067  last_data_time: 0.0050   lr: 0.00010989  max_mem: 1741M
[06/11 20:35:58 d2.utils.events]:  eta: 0:12:47  iter: 459  total_loss: 0.8639  loss_cls: 0.4039  loss_box_reg: 0.4699  loss_rpn_cls: 0.001506  loss_rpn_loc: 0.005473    time: 0.3053  last_time: 0.3346  data_time: 0.0050  last_data_time: 0.0075   lr: 0.00011489  max_mem: 1741M
[06/11 20:36:04 d2.utils.events]:  eta: 0:12:43  iter: 479  total_loss: 0.9197  loss_cls: 0.4147  loss_box_reg: 0.4703  loss_rpn_cls: 0.001106  loss_rpn_loc: 0.003868    time: 0.3055  last_time: 0.3063  data_time: 0.0055  last_data_time: 0.0053   lr: 0.00011988  max_mem: 1741M
[06/11 20:36:10 d2.utils.events]:  eta: 0:12:37  iter: 499  total_loss: 0.8988  loss_cls: 0.389  loss_box_reg: 0.4868  loss_rpn_cls: 0.0009406  loss_rpn_loc: 0.004928    time: 0.3056  last_time: 0.2989  data_time: 0.0050  last_data_time: 0.0055   lr: 0.00012488  max_mem: 1741M
[06/11 20:36:16 d2.utils.events]:  eta: 0:12:33  iter: 519  total_loss: 0.9875  loss_cls: 0.4343  loss_box_reg: 0.5373  loss_rpn_cls: 0.002393  loss_rpn_loc: 0.005588    time: 0.3061  last_time: 0.2593  data_time: 0.0095  last_data_time: 0.0045   lr: 0.00012987  max_mem: 1741M
[06/11 20:36:22 d2.utils.events]:  eta: 0:12:29  iter: 539  total_loss: 0.8312  loss_cls: 0.3743  loss_box_reg: 0.4504  loss_rpn_cls: 0.000758  loss_rpn_loc: 0.006799    time: 0.3063  last_time: 0.3133  data_time: 0.0058  last_data_time: 0.0049   lr: 0.00013487  max_mem: 1741M
[06/11 20:36:29 d2.utils.events]:  eta: 0:12:23  iter: 559  total_loss: 0.7929  loss_cls: 0.3586  loss_box_reg: 0.4137  loss_rpn_cls: 0.00103  loss_rpn_loc: 0.00767    time: 0.3067  last_time: 0.3064  data_time: 0.0074  last_data_time: 0.0045   lr: 0.00013986  max_mem: 1741M
[06/11 20:36:35 d2.utils.events]:  eta: 0:12:17  iter: 579  total_loss: 0.7283  loss_cls: 0.3237  loss_box_reg: 0.3886  loss_rpn_cls: 0.0006481  loss_rpn_loc: 0.007302    time: 0.3067  last_time: 0.3053  data_time: 0.0052  last_data_time: 0.0045   lr: 0.00014486  max_mem: 1741M
[06/11 20:36:41 d2.utils.events]:  eta: 0:12:11  iter: 599  total_loss: 0.8567  loss_cls: 0.387  loss_box_reg: 0.4653  loss_rpn_cls: 0.001215  loss_rpn_loc: 0.006119    time: 0.3068  last_time: 0.2994  data_time: 0.0064  last_data_time: 0.0049   lr: 0.00014985  max_mem: 1741M
[06/11 20:36:47 d2.utils.events]:  eta: 0:12:07  iter: 619  total_loss: 0.8823  loss_cls: 0.4077  loss_box_reg: 0.4411  loss_rpn_cls: 0.001151  loss_rpn_loc: 0.005647    time: 0.3069  last_time: 0.3060  data_time: 0.0054  last_data_time: 0.0047   lr: 0.00015485  max_mem: 1741M
[06/11 20:36:54 d2.utils.events]:  eta: 0:12:03  iter: 639  total_loss: 1.012  loss_cls: 0.458  loss_box_reg: 0.5275  loss_rpn_cls: 0.0006174  loss_rpn_loc: 0.006036    time: 0.3073  last_time: 0.3127  data_time: 0.0081  last_data_time: 0.0053   lr: 0.00015984  max_mem: 1741M
[06/11 20:37:00 d2.utils.events]:  eta: 0:11:56  iter: 659  total_loss: 0.8073  loss_cls: 0.3556  loss_box_reg: 0.4435  loss_rpn_cls: 0.0003938  loss_rpn_loc: 0.005126    time: 0.3068  last_time: 0.2983  data_time: 0.0052  last_data_time: 0.0063   lr: 0.00016484  max_mem: 1741M
[06/11 20:37:06 d2.utils.events]:  eta: 0:11:51  iter: 679  total_loss: 0.9049  loss_cls: 0.4423  loss_box_reg: 0.4713  loss_rpn_cls: 0.0004223  loss_rpn_loc: 0.00563    time: 0.3070  last_time: 0.3122  data_time: 0.0070  last_data_time: 0.0057   lr: 0.00016983  max_mem: 1741M
[06/11 20:37:12 d2.utils.events]:  eta: 0:11:44  iter: 699  total_loss: 0.9817  loss_cls: 0.4362  loss_box_reg: 0.5196  loss_rpn_cls: 0.0006572  loss_rpn_loc: 0.007235    time: 0.3068  last_time: 0.2957  data_time: 0.0050  last_data_time: 0.0060   lr: 0.00017483  max_mem: 1741M
[06/11 20:37:18 d2.utils.events]:  eta: 0:11:38  iter: 719  total_loss: 0.7555  loss_cls: 0.3706  loss_box_reg: 0.3968  loss_rpn_cls: 0.0004505  loss_rpn_loc: 0.007704    time: 0.3069  last_time: 0.3626  data_time: 0.0057  last_data_time: 0.0048   lr: 0.00017982  max_mem: 1741M
[06/11 20:37:25 d2.utils.events]:  eta: 0:11:34  iter: 739  total_loss: 0.8919  loss_cls: 0.3959  loss_box_reg: 0.4759  loss_rpn_cls: 0.000874  loss_rpn_loc: 0.005845    time: 0.3074  last_time: 0.2960  data_time: 0.0079  last_data_time: 0.0047   lr: 0.00018482  max_mem: 1741M
[06/11 20:37:31 d2.utils.events]:  eta: 0:11:27  iter: 759  total_loss: 0.9616  loss_cls: 0.4604  loss_box_reg: 0.478  loss_rpn_cls: 0.0001358  loss_rpn_loc: 0.00686    time: 0.3072  last_time: 0.3439  data_time: 0.0050  last_data_time: 0.0046   lr: 0.00018981  max_mem: 1741M
[06/11 20:37:37 d2.utils.events]:  eta: 0:11:22  iter: 779  total_loss: 0.8964  loss_cls: 0.4482  loss_box_reg: 0.4127  loss_rpn_cls: 0.0002859  loss_rpn_loc: 0.008684    time: 0.3075  last_time: 0.3444  data_time: 0.0081  last_data_time: 0.0054   lr: 0.00019481  max_mem: 1741M
[06/11 20:37:43 d2.utils.events]:  eta: 0:11:16  iter: 799  total_loss: 0.8738  loss_cls: 0.5057  loss_box_reg: 0.4574  loss_rpn_cls: 7.314e-05  loss_rpn_loc: 0.006098    time: 0.3075  last_time: 0.3416  data_time: 0.0049  last_data_time: 0.0048   lr: 0.0001998  max_mem: 1741M
[06/11 20:37:50 d2.utils.events]:  eta: 0:11:11  iter: 819  total_loss: 0.7175  loss_cls: 0.3771  loss_box_reg: 0.3554  loss_rpn_cls: 0.0008924  loss_rpn_loc: 0.006288    time: 0.3078  last_time: 0.2599  data_time: 0.0071  last_data_time: 0.0050   lr: 0.0002048  max_mem: 1741M
[06/11 20:37:55 d2.utils.events]:  eta: 0:11:04  iter: 839  total_loss: 0.8329  loss_cls: 0.3841  loss_box_reg: 0.3633  loss_rpn_cls: 0.003485  loss_rpn_loc: 0.005963    time: 0.3075  last_time: 0.3074  data_time: 0.0051  last_data_time: 0.0051   lr: 0.00020979  max_mem: 1741M
[06/11 20:38:02 d2.utils.events]:  eta: 0:10:59  iter: 859  total_loss: 0.659  loss_cls: 0.3083  loss_box_reg: 0.3013  loss_rpn_cls: 0.0003566  loss_rpn_loc: 0.005367    time: 0.3077  last_time: 0.2984  data_time: 0.0072  last_data_time: 0.0067   lr: 0.00021479  max_mem: 1741M
[06/11 20:38:08 d2.utils.events]:  eta: 0:10:53  iter: 879  total_loss: 0.6725  loss_cls: 0.3305  loss_box_reg: 0.332  loss_rpn_cls: 0.0004108  loss_rpn_loc: 0.006597    time: 0.3078  last_time: 0.3084  data_time: 0.0050  last_data_time: 0.0054   lr: 0.00021978  max_mem: 1741M
[06/11 20:38:14 d2.utils.events]:  eta: 0:10:47  iter: 899  total_loss: 0.6468  loss_cls: 0.3009  loss_box_reg: 0.3399  loss_rpn_cls: 0.0001286  loss_rpn_loc: 0.006254    time: 0.3081  last_time: 0.3223  data_time: 0.0075  last_data_time: 0.0050   lr: 0.00022478  max_mem: 1741M
[06/11 20:38:21 d2.utils.events]:  eta: 0:10:41  iter: 919  total_loss: 0.7455  loss_cls: 0.3679  loss_box_reg: 0.3099  loss_rpn_cls: 0.0003345  loss_rpn_loc: 0.007432    time: 0.3082  last_time: 0.3088  data_time: 0.0060  last_data_time: 0.0050   lr: 0.00022977  max_mem: 1741M
[06/11 20:38:27 d2.utils.events]:  eta: 0:10:35  iter: 939  total_loss: 0.6254  loss_cls: 0.3381  loss_box_reg: 0.2799  loss_rpn_cls: 0.0001089  loss_rpn_loc: 0.004873    time: 0.3084  last_time: 0.3616  data_time: 0.0074  last_data_time: 0.0052   lr: 0.00023477  max_mem: 1741M
[06/11 20:38:34 d2.utils.events]:  eta: 0:10:29  iter: 959  total_loss: 0.5635  loss_cls: 0.2898  loss_box_reg: 0.2433  loss_rpn_cls: 0.0002253  loss_rpn_loc: 0.007171    time: 0.3087  last_time: 0.3141  data_time: 0.0064  last_data_time: 0.0049   lr: 0.00023976  max_mem: 1741M
[06/11 20:38:40 d2.utils.events]:  eta: 0:10:24  iter: 979  total_loss: 0.6049  loss_cls: 0.2994  loss_box_reg: 0.3102  loss_rpn_cls: 0.0001145  loss_rpn_loc: 0.005947    time: 0.3089  last_time: 0.3300  data_time: 0.0065  last_data_time: 0.0081   lr: 0.00024476  max_mem: 1741M
[06/11 20:38:46 d2.utils.events]:  eta: 0:10:18  iter: 999  total_loss: 0.7097  loss_cls: 0.2944  loss_box_reg: 0.322  loss_rpn_cls: 0.0004589  loss_rpn_loc: 0.006573    time: 0.3089  last_time: 0.3012  data_time: 0.0080  last_data_time: 0.0048   lr: 0.00024975  max_mem: 1741M
[06/11 20:38:52 d2.utils.events]:  eta: 0:10:12  iter: 1019  total_loss: 0.6394  loss_cls: 0.337  loss_box_reg: 0.263  loss_rpn_cls: 9.904e-05  loss_rpn_loc: 0.005514    time: 0.3089  last_time: 0.3640  data_time: 0.0055  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:38:59 d2.utils.events]:  eta: 0:10:06  iter: 1039  total_loss: 0.6053  loss_cls: 0.3069  loss_box_reg: 0.2671  loss_rpn_cls: 0.0001465  loss_rpn_loc: 0.005691    time: 0.3091  last_time: 0.2592  data_time: 0.0075  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:39:05 d2.utils.events]:  eta: 0:10:00  iter: 1059  total_loss: 0.4954  loss_cls: 0.2446  loss_box_reg: 0.2582  loss_rpn_cls: 0.0003017  loss_rpn_loc: 0.005089    time: 0.3090  last_time: 0.3100  data_time: 0.0050  last_data_time: 0.0055   lr: 0.00025  max_mem: 1741M
[06/11 20:39:11 d2.utils.events]:  eta: 0:09:54  iter: 1079  total_loss: 0.5243  loss_cls: 0.2357  loss_box_reg: 0.2841  loss_rpn_cls: 0.0007255  loss_rpn_loc: 0.004742    time: 0.3092  last_time: 0.2724  data_time: 0.0089  last_data_time: 0.0053   lr: 0.00025  max_mem: 1741M
[06/11 20:39:17 d2.utils.events]:  eta: 0:09:48  iter: 1099  total_loss: 0.4944  loss_cls: 0.2349  loss_box_reg: 0.2492  loss_rpn_cls: 0.0002902  loss_rpn_loc: 0.006378    time: 0.3092  last_time: 0.3108  data_time: 0.0050  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:39:24 d2.utils.events]:  eta: 0:09:43  iter: 1119  total_loss: 0.6454  loss_cls: 0.3468  loss_box_reg: 0.2843  loss_rpn_cls: 0.001949  loss_rpn_loc: 0.004964    time: 0.3096  last_time: 0.3425  data_time: 0.0061  last_data_time: 0.0061   lr: 0.00025  max_mem: 1741M
[06/11 20:39:30 d2.utils.events]:  eta: 0:09:37  iter: 1139  total_loss: 0.475  loss_cls: 0.2638  loss_box_reg: 0.2413  loss_rpn_cls: 4.916e-05  loss_rpn_loc: 0.006534    time: 0.3096  last_time: 0.3105  data_time: 0.0053  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:39:37 d2.utils.events]:  eta: 0:09:31  iter: 1159  total_loss: 0.5332  loss_cls: 0.244  loss_box_reg: 0.2518  loss_rpn_cls: 0.0003242  loss_rpn_loc: 0.005525    time: 0.3099  last_time: 0.2971  data_time: 0.0072  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:39:43 d2.utils.events]:  eta: 0:09:25  iter: 1179  total_loss: 0.5827  loss_cls: 0.291  loss_box_reg: 0.298  loss_rpn_cls: 0.0002353  loss_rpn_loc: 0.00518    time: 0.3098  last_time: 0.3094  data_time: 0.0052  last_data_time: 0.0066   lr: 0.00025  max_mem: 1741M
[06/11 20:39:49 d2.utils.events]:  eta: 0:09:19  iter: 1199  total_loss: 0.4459  loss_cls: 0.1968  loss_box_reg: 0.2495  loss_rpn_cls: 0.0001281  loss_rpn_loc: 0.006202    time: 0.3100  last_time: 0.3511  data_time: 0.0059  last_data_time: 0.0057   lr: 0.00025  max_mem: 1741M
[06/11 20:39:55 d2.utils.events]:  eta: 0:09:12  iter: 1219  total_loss: 0.5307  loss_cls: 0.2474  loss_box_reg: 0.2545  loss_rpn_cls: 0.0003785  loss_rpn_loc: 0.006063    time: 0.3098  last_time: 0.2786  data_time: 0.0053  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:40:02 d2.utils.events]:  eta: 0:09:06  iter: 1239  total_loss: 0.4307  loss_cls: 0.2059  loss_box_reg: 0.1867  loss_rpn_cls: 0.001089  loss_rpn_loc: 0.004802    time: 0.3099  last_time: 0.3297  data_time: 0.0059  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:40:08 d2.utils.events]:  eta: 0:09:00  iter: 1259  total_loss: 0.4574  loss_cls: 0.228  loss_box_reg: 0.2153  loss_rpn_cls: 0.0001665  loss_rpn_loc: 0.004766    time: 0.3099  last_time: 0.3483  data_time: 0.0068  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:40:14 d2.utils.events]:  eta: 0:08:54  iter: 1279  total_loss: 0.5106  loss_cls: 0.2105  loss_box_reg: 0.2712  loss_rpn_cls: 0.0006925  loss_rpn_loc: 0.005501    time: 0.3097  last_time: 0.3234  data_time: 0.0055  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:40:20 d2.utils.events]:  eta: 0:08:48  iter: 1299  total_loss: 0.4349  loss_cls: 0.1927  loss_box_reg: 0.2491  loss_rpn_cls: 0.0005387  loss_rpn_loc: 0.004978    time: 0.3098  last_time: 0.3114  data_time: 0.0072  last_data_time: 0.0053   lr: 0.00025  max_mem: 1741M
[06/11 20:40:26 d2.utils.events]:  eta: 0:08:41  iter: 1319  total_loss: 0.4157  loss_cls: 0.1926  loss_box_reg: 0.1998  loss_rpn_cls: 6.382e-05  loss_rpn_loc: 0.00632    time: 0.3096  last_time: 0.2930  data_time: 0.0047  last_data_time: 0.0053   lr: 0.00025  max_mem: 1741M
[06/11 20:40:33 d2.utils.events]:  eta: 0:08:35  iter: 1339  total_loss: 0.5027  loss_cls: 0.2794  loss_box_reg: 0.2363  loss_rpn_cls: 2.279e-05  loss_rpn_loc: 0.005562    time: 0.3098  last_time: 0.2732  data_time: 0.0114  last_data_time: 0.0048   lr: 0.00025  max_mem: 1741M
[06/11 20:40:39 d2.utils.events]:  eta: 0:08:29  iter: 1359  total_loss: 0.3496  loss_cls: 0.1846  loss_box_reg: 0.1563  loss_rpn_cls: 0.0001179  loss_rpn_loc: 0.004049    time: 0.3099  last_time: 0.2733  data_time: 0.0050  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:40:45 d2.utils.events]:  eta: 0:08:22  iter: 1379  total_loss: 0.457  loss_cls: 0.204  loss_box_reg: 0.1922  loss_rpn_cls: 0.0001944  loss_rpn_loc: 0.005395    time: 0.3101  last_time: 0.3095  data_time: 0.0116  last_data_time: 0.0048   lr: 0.00025  max_mem: 1741M
[06/11 20:40:51 d2.utils.events]:  eta: 0:08:16  iter: 1399  total_loss: 0.4583  loss_cls: 0.2052  loss_box_reg: 0.1925  loss_rpn_cls: 4.584e-05  loss_rpn_loc: 0.004583    time: 0.3099  last_time: 0.3456  data_time: 0.0047  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:40:58 d2.utils.events]:  eta: 0:08:10  iter: 1419  total_loss: 0.3578  loss_cls: 0.1709  loss_box_reg: 0.1985  loss_rpn_cls: 0.000271  loss_rpn_loc: 0.004827    time: 0.3101  last_time: 0.2775  data_time: 0.0096  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:41:04 d2.utils.events]:  eta: 0:08:03  iter: 1439  total_loss: 0.4879  loss_cls: 0.192  loss_box_reg: 0.2562  loss_rpn_cls: 9.113e-05  loss_rpn_loc: 0.005959    time: 0.3101  last_time: 0.2678  data_time: 0.0056  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:41:10 d2.utils.events]:  eta: 0:07:57  iter: 1459  total_loss: 0.4775  loss_cls: 0.1964  loss_box_reg: 0.282  loss_rpn_cls: 0.0002932  loss_rpn_loc: 0.006341    time: 0.3103  last_time: 0.3177  data_time: 0.0068  last_data_time: 0.0072   lr: 0.00025  max_mem: 1741M
[06/11 20:41:16 d2.utils.events]:  eta: 0:07:51  iter: 1479  total_loss: 0.4415  loss_cls: 0.1834  loss_box_reg: 0.2194  loss_rpn_cls: 0.0002081  loss_rpn_loc: 0.004941    time: 0.3102  last_time: 0.2957  data_time: 0.0050  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:41:23 d2.utils.events]:  eta: 0:07:45  iter: 1499  total_loss: 0.3737  loss_cls: 0.1718  loss_box_reg: 0.1703  loss_rpn_cls: 5.463e-05  loss_rpn_loc: 0.004642    time: 0.3103  last_time: 0.3391  data_time: 0.0084  last_data_time: 0.0120   lr: 0.00025  max_mem: 1741M
[06/11 20:41:29 d2.utils.events]:  eta: 0:07:39  iter: 1519  total_loss: 0.3528  loss_cls: 0.1445  loss_box_reg: 0.2039  loss_rpn_cls: 0.0002279  loss_rpn_loc: 0.004593    time: 0.3104  last_time: 0.3434  data_time: 0.0052  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:41:36 d2.utils.events]:  eta: 0:07:33  iter: 1539  total_loss: 0.4302  loss_cls: 0.2025  loss_box_reg: 0.2201  loss_rpn_cls: 6.719e-05  loss_rpn_loc: 0.003882    time: 0.3105  last_time: 0.2859  data_time: 0.0053  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:41:42 d2.utils.events]:  eta: 0:07:27  iter: 1559  total_loss: 0.3478  loss_cls: 0.1456  loss_box_reg: 0.1734  loss_rpn_cls: 0.0001514  loss_rpn_loc: 0.005128    time: 0.3104  last_time: 0.3105  data_time: 0.0067  last_data_time: 0.0043   lr: 0.00025  max_mem: 1741M
[06/11 20:41:48 d2.utils.events]:  eta: 0:07:20  iter: 1579  total_loss: 0.384  loss_cls: 0.1774  loss_box_reg: 0.2152  loss_rpn_cls: 0.0009027  loss_rpn_loc: 0.006034    time: 0.3105  last_time: 0.3107  data_time: 0.0054  last_data_time: 0.0053   lr: 0.00025  max_mem: 1741M
[06/11 20:41:54 d2.utils.events]:  eta: 0:07:14  iter: 1599  total_loss: 0.4076  loss_cls: 0.1389  loss_box_reg: 0.2121  loss_rpn_cls: 0.0002629  loss_rpn_loc: 0.004774    time: 0.3105  last_time: 0.3431  data_time: 0.0078  last_data_time: 0.0044   lr: 0.00025  max_mem: 1741M
[06/11 20:42:00 d2.utils.events]:  eta: 0:07:08  iter: 1619  total_loss: 0.4158  loss_cls: 0.1702  loss_box_reg: 0.2417  loss_rpn_cls: 8.133e-05  loss_rpn_loc: 0.005207    time: 0.3104  last_time: 0.3391  data_time: 0.0052  last_data_time: 0.0060   lr: 0.00025  max_mem: 1741M
[06/11 20:42:07 d2.utils.events]:  eta: 0:07:02  iter: 1639  total_loss: 0.3269  loss_cls: 0.1558  loss_box_reg: 0.1751  loss_rpn_cls: 0.0002429  loss_rpn_loc: 0.006067    time: 0.3107  last_time: 0.3408  data_time: 0.0099  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:42:13 d2.utils.events]:  eta: 0:06:56  iter: 1659  total_loss: 0.4242  loss_cls: 0.1831  loss_box_reg: 0.2106  loss_rpn_cls: 5.939e-05  loss_rpn_loc: 0.003936    time: 0.3107  last_time: 0.3467  data_time: 0.0055  last_data_time: 0.0058   lr: 0.00025  max_mem: 1741M
[06/11 20:42:20 d2.utils.events]:  eta: 0:06:50  iter: 1679  total_loss: 0.345  loss_cls: 0.1388  loss_box_reg: 0.1944  loss_rpn_cls: 0.000132  loss_rpn_loc: 0.004608    time: 0.3108  last_time: 0.2642  data_time: 0.0068  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:42:26 d2.utils.events]:  eta: 0:06:44  iter: 1699  total_loss: 0.368  loss_cls: 0.1584  loss_box_reg: 0.1991  loss_rpn_cls: 0.0002118  loss_rpn_loc: 0.003883    time: 0.3108  last_time: 0.3517  data_time: 0.0051  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:42:32 d2.utils.events]:  eta: 0:06:38  iter: 1719  total_loss: 0.3562  loss_cls: 0.1319  loss_box_reg: 0.2149  loss_rpn_cls: 5.872e-05  loss_rpn_loc: 0.005738    time: 0.3110  last_time: 0.3377  data_time: 0.0075  last_data_time: 0.0042   lr: 0.00025  max_mem: 1741M
[06/11 20:42:39 d2.utils.events]:  eta: 0:06:31  iter: 1739  total_loss: 0.3949  loss_cls: 0.1426  loss_box_reg: 0.2097  loss_rpn_cls: 0.0001025  loss_rpn_loc: 0.005106    time: 0.3110  last_time: 0.3581  data_time: 0.0060  last_data_time: 0.0236   lr: 0.00025  max_mem: 1741M
[06/11 20:42:45 d2.utils.events]:  eta: 0:06:26  iter: 1759  total_loss: 0.3415  loss_cls: 0.1673  loss_box_reg: 0.199  loss_rpn_cls: 7.969e-05  loss_rpn_loc: 0.00473    time: 0.3111  last_time: 0.3320  data_time: 0.0065  last_data_time: 0.0058   lr: 0.00025  max_mem: 1741M
[06/11 20:42:51 d2.utils.events]:  eta: 0:06:19  iter: 1779  total_loss: 0.3343  loss_cls: 0.1455  loss_box_reg: 0.1982  loss_rpn_cls: 9.239e-05  loss_rpn_loc: 0.005232    time: 0.3110  last_time: 0.3468  data_time: 0.0056  last_data_time: 0.0044   lr: 0.00025  max_mem: 1741M
[06/11 20:42:58 d2.utils.events]:  eta: 0:06:13  iter: 1799  total_loss: 0.3343  loss_cls: 0.1206  loss_box_reg: 0.2023  loss_rpn_cls: 0.0001425  loss_rpn_loc: 0.004199    time: 0.3111  last_time: 0.3665  data_time: 0.0067  last_data_time: 0.0249   lr: 0.00025  max_mem: 1741M
[06/11 20:43:04 d2.utils.events]:  eta: 0:06:07  iter: 1819  total_loss: 0.3718  loss_cls: 0.1307  loss_box_reg: 0.2325  loss_rpn_cls: 0.0001615  loss_rpn_loc: 0.004393    time: 0.3111  last_time: 0.2699  data_time: 0.0057  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:43:10 d2.utils.events]:  eta: 0:06:01  iter: 1839  total_loss: 0.3074  loss_cls: 0.1255  loss_box_reg: 0.1466  loss_rpn_cls: 4.226e-05  loss_rpn_loc: 0.00325    time: 0.3111  last_time: 0.3429  data_time: 0.0075  last_data_time: 0.0245   lr: 0.00025  max_mem: 1741M
[06/11 20:43:16 d2.utils.events]:  eta: 0:05:55  iter: 1859  total_loss: 0.3296  loss_cls: 0.1258  loss_box_reg: 0.204  loss_rpn_cls: 0.000267  loss_rpn_loc: 0.004424    time: 0.3111  last_time: 0.3097  data_time: 0.0070  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:43:22 d2.utils.events]:  eta: 0:05:49  iter: 1879  total_loss: 0.3626  loss_cls: 0.1397  loss_box_reg: 0.2528  loss_rpn_cls: 0.0002565  loss_rpn_loc: 0.004212    time: 0.3110  last_time: 0.3071  data_time: 0.0058  last_data_time: 0.0045   lr: 0.00025  max_mem: 1741M
[06/11 20:43:28 d2.utils.events]:  eta: 0:05:42  iter: 1899  total_loss: 0.3294  loss_cls: 0.133  loss_box_reg: 0.2045  loss_rpn_cls: 8.698e-05  loss_rpn_loc: 0.004865    time: 0.3110  last_time: 0.3076  data_time: 0.0109  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:43:34 d2.utils.events]:  eta: 0:05:35  iter: 1919  total_loss: 0.3516  loss_cls: 0.1293  loss_box_reg: 0.207  loss_rpn_cls: 0.0001196  loss_rpn_loc: 0.004348    time: 0.3108  last_time: 0.3010  data_time: 0.0049  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:43:41 d2.utils.events]:  eta: 0:05:30  iter: 1939  total_loss: 0.2541  loss_cls: 0.09967  loss_box_reg: 0.1511  loss_rpn_cls: 0.0001239  loss_rpn_loc: 0.005369    time: 0.3110  last_time: 0.2790  data_time: 0.0098  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:43:47 d2.utils.events]:  eta: 0:05:23  iter: 1959  total_loss: 0.2939  loss_cls: 0.09826  loss_box_reg: 0.1528  loss_rpn_cls: 0.0001608  loss_rpn_loc: 0.004152    time: 0.3109  last_time: 0.3103  data_time: 0.0048  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:43:53 d2.utils.events]:  eta: 0:05:17  iter: 1979  total_loss: 0.3116  loss_cls: 0.11  loss_box_reg: 0.1904  loss_rpn_cls: 4.749e-05  loss_rpn_loc: 0.005977    time: 0.3110  last_time: 0.3483  data_time: 0.0101  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:44:00 d2.utils.events]:  eta: 0:05:10  iter: 1999  total_loss: 0.3077  loss_cls: 0.09759  loss_box_reg: 0.1768  loss_rpn_cls: 0.0004942  loss_rpn_loc: 0.005076    time: 0.3111  last_time: 0.3152  data_time: 0.0055  last_data_time: 0.0054   lr: 0.00025  max_mem: 1741M
[06/11 20:44:06 d2.utils.events]:  eta: 0:05:04  iter: 2019  total_loss: 0.3271  loss_cls: 0.0948  loss_box_reg: 0.2129  loss_rpn_cls: 4.647e-05  loss_rpn_loc: 0.004493    time: 0.3112  last_time: 0.2638  data_time: 0.0068  last_data_time: 0.0045   lr: 0.00025  max_mem: 1741M
[06/11 20:44:12 d2.utils.events]:  eta: 0:04:58  iter: 2039  total_loss: 0.272  loss_cls: 0.0865  loss_box_reg: 0.1585  loss_rpn_cls: 0.0001224  loss_rpn_loc: 0.003978    time: 0.3111  last_time: 0.3015  data_time: 0.0056  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:44:19 d2.utils.events]:  eta: 0:04:52  iter: 2059  total_loss: 0.3565  loss_cls: 0.1153  loss_box_reg: 0.2296  loss_rpn_cls: 0.0001185  loss_rpn_loc: 0.004301    time: 0.3112  last_time: 0.3231  data_time: 0.0092  last_data_time: 0.0140   lr: 0.00025  max_mem: 1741M
[06/11 20:44:25 d2.utils.events]:  eta: 0:04:46  iter: 2079  total_loss: 0.3089  loss_cls: 0.123  loss_box_reg: 0.2089  loss_rpn_cls: 0.0003092  loss_rpn_loc: 0.004962    time: 0.3112  last_time: 0.3442  data_time: 0.0058  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:44:31 d2.utils.events]:  eta: 0:04:40  iter: 2099  total_loss: 0.2896  loss_cls: 0.1348  loss_box_reg: 0.1612  loss_rpn_cls: 0.0001174  loss_rpn_loc: 0.004278    time: 0.3111  last_time: 0.3415  data_time: 0.0090  last_data_time: 0.0293   lr: 0.00025  max_mem: 1741M
[06/11 20:44:37 d2.utils.events]:  eta: 0:04:33  iter: 2119  total_loss: 0.3099  loss_cls: 0.08321  loss_box_reg: 0.2024  loss_rpn_cls: 0.0002978  loss_rpn_loc: 0.004513    time: 0.3111  last_time: 0.2741  data_time: 0.0071  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:44:44 d2.utils.events]:  eta: 0:04:28  iter: 2139  total_loss: 0.2988  loss_cls: 0.1076  loss_box_reg: 0.1559  loss_rpn_cls: 0.0003307  loss_rpn_loc: 0.005221    time: 0.3112  last_time: 0.3519  data_time: 0.0051  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:44:50 d2.utils.events]:  eta: 0:04:21  iter: 2159  total_loss: 0.2366  loss_cls: 0.09478  loss_box_reg: 0.1412  loss_rpn_cls: 0.0002482  loss_rpn_loc: 0.004178    time: 0.3113  last_time: 0.2769  data_time: 0.0059  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:44:56 d2.utils.events]:  eta: 0:04:15  iter: 2179  total_loss: 0.2739  loss_cls: 0.08571  loss_box_reg: 0.1459  loss_rpn_cls: 0.0001205  loss_rpn_loc: 0.003696    time: 0.3112  last_time: 0.3417  data_time: 0.0049  last_data_time: 0.0048   lr: 0.00025  max_mem: 1741M
[06/11 20:45:03 d2.utils.events]:  eta: 0:04:09  iter: 2199  total_loss: 0.2999  loss_cls: 0.1127  loss_box_reg: 0.1669  loss_rpn_cls: 4.948e-05  loss_rpn_loc: 0.004188    time: 0.3114  last_time: 0.3437  data_time: 0.0106  last_data_time: 0.0058   lr: 0.00025  max_mem: 1741M
[06/11 20:45:09 d2.utils.events]:  eta: 0:04:03  iter: 2219  total_loss: 0.2997  loss_cls: 0.108  loss_box_reg: 0.2004  loss_rpn_cls: 0.000184  loss_rpn_loc: 0.004596    time: 0.3113  last_time: 0.2944  data_time: 0.0049  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:45:15 d2.utils.events]:  eta: 0:03:57  iter: 2239  total_loss: 0.3049  loss_cls: 0.1085  loss_box_reg: 0.1568  loss_rpn_cls: 0.0001501  loss_rpn_loc: 0.004245    time: 0.3114  last_time: 0.3439  data_time: 0.0054  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:45:22 d2.utils.events]:  eta: 0:03:50  iter: 2259  total_loss: 0.327  loss_cls: 0.126  loss_box_reg: 0.2154  loss_rpn_cls: 8.438e-05  loss_rpn_loc: 0.003886    time: 0.3115  last_time: 0.3452  data_time: 0.0059  last_data_time: 0.0052   lr: 0.00025  max_mem: 1741M
[06/11 20:45:28 d2.utils.events]:  eta: 0:03:44  iter: 2279  total_loss: 0.2755  loss_cls: 0.1132  loss_box_reg: 0.1363  loss_rpn_cls: 6.122e-05  loss_rpn_loc: 0.003962    time: 0.3116  last_time: 0.2771  data_time: 0.0127  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:45:34 d2.utils.events]:  eta: 0:03:38  iter: 2299  total_loss: 0.2803  loss_cls: 0.07782  loss_box_reg: 0.1588  loss_rpn_cls: 0.0004556  loss_rpn_loc: 0.004669    time: 0.3116  last_time: 0.3488  data_time: 0.0051  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:45:41 d2.utils.events]:  eta: 0:03:32  iter: 2319  total_loss: 0.3177  loss_cls: 0.11  loss_box_reg: 0.2062  loss_rpn_cls: 0.0001161  loss_rpn_loc: 0.004855    time: 0.3116  last_time: 0.3094  data_time: 0.0086  last_data_time: 0.0278   lr: 0.00025  max_mem: 1741M
[06/11 20:45:47 d2.utils.events]:  eta: 0:03:26  iter: 2339  total_loss: 0.2608  loss_cls: 0.08468  loss_box_reg: 0.1548  loss_rpn_cls: 0.0001667  loss_rpn_loc: 0.004683    time: 0.3115  last_time: 0.2984  data_time: 0.0050  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:45:53 d2.utils.events]:  eta: 0:03:19  iter: 2359  total_loss: 0.2631  loss_cls: 0.07244  loss_box_reg: 0.1477  loss_rpn_cls: 7.27e-05  loss_rpn_loc: 0.003406    time: 0.3116  last_time: 0.3267  data_time: 0.0061  last_data_time: 0.0095   lr: 0.00025  max_mem: 1741M
[06/11 20:45:59 d2.utils.events]:  eta: 0:03:13  iter: 2379  total_loss: 0.2926  loss_cls: 0.08526  loss_box_reg: 0.2036  loss_rpn_cls: 9.775e-05  loss_rpn_loc: 0.004379    time: 0.3117  last_time: 0.2441  data_time: 0.0073  last_data_time: 0.0054   lr: 0.00025  max_mem: 1741M
[06/11 20:46:05 d2.utils.events]:  eta: 0:03:07  iter: 2399  total_loss: 0.2863  loss_cls: 0.09254  loss_box_reg: 0.1438  loss_rpn_cls: 0.0001555  loss_rpn_loc: 0.003814    time: 0.3115  last_time: 0.3192  data_time: 0.0055  last_data_time: 0.0071   lr: 0.00025  max_mem: 1741M
[06/11 20:46:12 d2.utils.events]:  eta: 0:03:01  iter: 2419  total_loss: 0.2795  loss_cls: 0.08783  loss_box_reg: 0.1602  loss_rpn_cls: 0.0001029  loss_rpn_loc: 0.006224    time: 0.3115  last_time: 0.2718  data_time: 0.0075  last_data_time: 0.0045   lr: 0.00025  max_mem: 1741M
[06/11 20:46:18 d2.utils.events]:  eta: 0:02:54  iter: 2439  total_loss: 0.2452  loss_cls: 0.09015  loss_box_reg: 0.1481  loss_rpn_cls: 2.676e-05  loss_rpn_loc: 0.003399    time: 0.3115  last_time: 0.3443  data_time: 0.0052  last_data_time: 0.0044   lr: 0.00025  max_mem: 1741M
[06/11 20:46:24 d2.utils.events]:  eta: 0:02:48  iter: 2459  total_loss: 0.2634  loss_cls: 0.07237  loss_box_reg: 0.1703  loss_rpn_cls: 0.0002175  loss_rpn_loc: 0.005518    time: 0.3116  last_time: 0.3466  data_time: 0.0090  last_data_time: 0.0052   lr: 0.00025  max_mem: 1741M
[06/11 20:46:30 d2.utils.events]:  eta: 0:02:42  iter: 2479  total_loss: 0.3637  loss_cls: 0.09114  loss_box_reg: 0.2166  loss_rpn_cls: 6.508e-05  loss_rpn_loc: 0.004147    time: 0.3115  last_time: 0.3110  data_time: 0.0050  last_data_time: 0.0053   lr: 0.00025  max_mem: 1741M
[06/11 20:46:37 d2.utils.events]:  eta: 0:02:36  iter: 2499  total_loss: 0.2938  loss_cls: 0.1161  loss_box_reg: 0.1812  loss_rpn_cls: 0.0001269  loss_rpn_loc: 0.00466    time: 0.3116  last_time: 0.3127  data_time: 0.0059  last_data_time: 0.0048   lr: 0.00025  max_mem: 1741M
[06/11 20:46:43 d2.utils.events]:  eta: 0:02:29  iter: 2519  total_loss: 0.2614  loss_cls: 0.08164  loss_box_reg: 0.1555  loss_rpn_cls: 4.377e-05  loss_rpn_loc: 0.003928    time: 0.3116  last_time: 0.3411  data_time: 0.0053  last_data_time: 0.0062   lr: 0.00025  max_mem: 1741M
[06/11 20:46:49 d2.utils.events]:  eta: 0:02:23  iter: 2539  total_loss: 0.2843  loss_cls: 0.0856  loss_box_reg: 0.1769  loss_rpn_cls: 0.0001089  loss_rpn_loc: 0.004282    time: 0.3116  last_time: 0.3072  data_time: 0.0105  last_data_time: 0.0049   lr: 0.00025  max_mem: 1741M
[06/11 20:46:56 d2.utils.events]:  eta: 0:02:17  iter: 2559  total_loss: 0.2404  loss_cls: 0.05729  loss_box_reg: 0.1346  loss_rpn_cls: 0.0001844  loss_rpn_loc: 0.004045    time: 0.3117  last_time: 0.3413  data_time: 0.0052  last_data_time: 0.0059   lr: 0.00025  max_mem: 1741M
[06/11 20:47:02 d2.utils.events]:  eta: 0:02:11  iter: 2579  total_loss: 0.2827  loss_cls: 0.08241  loss_box_reg: 0.1623  loss_rpn_cls: 0.0001833  loss_rpn_loc: 0.004964    time: 0.3117  last_time: 0.2849  data_time: 0.0070  last_data_time: 0.0052   lr: 0.00025  max_mem: 1741M
[06/11 20:47:08 d2.utils.events]:  eta: 0:02:04  iter: 2599  total_loss: 0.1806  loss_cls: 0.04925  loss_box_reg: 0.1209  loss_rpn_cls: 0.0001038  loss_rpn_loc: 0.003911    time: 0.3117  last_time: 0.3076  data_time: 0.0059  last_data_time: 0.0048   lr: 0.00025  max_mem: 1741M
[06/11 20:47:14 d2.utils.events]:  eta: 0:01:58  iter: 2619  total_loss: 0.2321  loss_cls: 0.08025  loss_box_reg: 0.1444  loss_rpn_cls: 0.0001534  loss_rpn_loc: 0.004038    time: 0.3117  last_time: 0.3474  data_time: 0.0071  last_data_time: 0.0214   lr: 0.00025  max_mem: 1741M
[06/11 20:47:21 d2.utils.events]:  eta: 0:01:52  iter: 2639  total_loss: 0.2764  loss_cls: 0.09412  loss_box_reg: 0.1555  loss_rpn_cls: 0.0002002  loss_rpn_loc: 0.00343    time: 0.3116  last_time: 0.3438  data_time: 0.0061  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:47:27 d2.utils.events]:  eta: 0:01:46  iter: 2659  total_loss: 0.2549  loss_cls: 0.07446  loss_box_reg: 0.1477  loss_rpn_cls: 0.0001097  loss_rpn_loc: 0.004094    time: 0.3117  last_time: 0.3290  data_time: 0.0067  last_data_time: 0.0160   lr: 0.00025  max_mem: 1741M
[06/11 20:47:33 d2.utils.events]:  eta: 0:01:39  iter: 2679  total_loss: 0.2102  loss_cls: 0.07973  loss_box_reg: 0.1381  loss_rpn_cls: 0.0001163  loss_rpn_loc: 0.003937    time: 0.3116  last_time: 0.3435  data_time: 0.0071  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:47:39 d2.utils.events]:  eta: 0:01:33  iter: 2699  total_loss: 0.2647  loss_cls: 0.07447  loss_box_reg: 0.1825  loss_rpn_cls: 0.0001543  loss_rpn_loc: 0.00506    time: 0.3116  last_time: 0.2663  data_time: 0.0055  last_data_time: 0.0065   lr: 0.00025  max_mem: 1741M
[06/11 20:47:46 d2.utils.events]:  eta: 0:01:27  iter: 2719  total_loss: 0.2728  loss_cls: 0.08463  loss_box_reg: 0.207  loss_rpn_cls: 8.083e-05  loss_rpn_loc: 0.004625    time: 0.3117  last_time: 0.2591  data_time: 0.0066  last_data_time: 0.0047   lr: 0.00025  max_mem: 1741M
[06/11 20:47:52 d2.utils.events]:  eta: 0:01:21  iter: 2739  total_loss: 0.2941  loss_cls: 0.0788  loss_box_reg: 0.219  loss_rpn_cls: 1.242e-05  loss_rpn_loc: 0.004094    time: 0.3116  last_time: 0.3075  data_time: 0.0050  last_data_time: 0.0052   lr: 0.00025  max_mem: 1741M
[06/11 20:47:58 d2.utils.events]:  eta: 0:01:14  iter: 2759  total_loss: 0.2062  loss_cls: 0.0559  loss_box_reg: 0.1343  loss_rpn_cls: 5.696e-05  loss_rpn_loc: 0.004084    time: 0.3117  last_time: 0.2744  data_time: 0.0090  last_data_time: 0.0052   lr: 0.00025  max_mem: 1741M
[06/11 20:48:04 d2.utils.events]:  eta: 0:01:08  iter: 2779  total_loss: 0.2186  loss_cls: 0.07095  loss_box_reg: 0.1375  loss_rpn_cls: 0.0001066  loss_rpn_loc: 0.004045    time: 0.3116  last_time: 0.3152  data_time: 0.0050  last_data_time: 0.0064   lr: 0.00025  max_mem: 1741M
[06/11 20:48:10 d2.utils.events]:  eta: 0:01:02  iter: 2799  total_loss: 0.2302  loss_cls: 0.08553  loss_box_reg: 0.1382  loss_rpn_cls: 0.0001705  loss_rpn_loc: 0.005347    time: 0.3117  last_time: 0.3518  data_time: 0.0087  last_data_time: 0.0042   lr: 0.00025  max_mem: 1741M
[06/11 20:48:16 d2.utils.events]:  eta: 0:00:56  iter: 2819  total_loss: 0.1827  loss_cls: 0.06127  loss_box_reg: 0.1317  loss_rpn_cls: 0.0005495  loss_rpn_loc: 0.003968    time: 0.3116  last_time: 0.3071  data_time: 0.0050  last_data_time: 0.0050   lr: 0.00025  max_mem: 1741M
[06/11 20:48:23 d2.utils.events]:  eta: 0:00:49  iter: 2839  total_loss: 0.22  loss_cls: 0.06739  loss_box_reg: 0.1506  loss_rpn_cls: 8.22e-05  loss_rpn_loc: 0.004879    time: 0.3116  last_time: 0.3323  data_time: 0.0067  last_data_time: 0.0169   lr: 0.00025  max_mem: 1741M
[06/11 20:48:29 d2.utils.events]:  eta: 0:00:43  iter: 2859  total_loss: 0.2654  loss_cls: 0.08714  loss_box_reg: 0.1493  loss_rpn_cls: 0.0001981  loss_rpn_loc: 0.005691    time: 0.3116  last_time: 0.2939  data_time: 0.0052  last_data_time: 0.0044   lr: 0.00025  max_mem: 1741M
[06/11 20:48:35 d2.utils.events]:  eta: 0:00:37  iter: 2879  total_loss: 0.2088  loss_cls: 0.05448  loss_box_reg: 0.1454  loss_rpn_cls: 6.847e-05  loss_rpn_loc: 0.004805    time: 0.3116  last_time: 0.3389  data_time: 0.0065  last_data_time: 0.0059   lr: 0.00025  max_mem: 1741M
[06/11 20:48:42 d2.utils.events]:  eta: 0:00:31  iter: 2899  total_loss: 0.2073  loss_cls: 0.05473  loss_box_reg: 0.1491  loss_rpn_cls: 0.0001279  loss_rpn_loc: 0.003445    time: 0.3116  last_time: 0.2976  data_time: 0.0065  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:48:48 d2.utils.events]:  eta: 0:00:24  iter: 2919  total_loss: 0.2802  loss_cls: 0.08243  loss_box_reg: 0.1698  loss_rpn_cls: 0.0001012  loss_rpn_loc: 0.004326    time: 0.3116  last_time: 0.3062  data_time: 0.0057  last_data_time: 0.0098   lr: 0.00025  max_mem: 1741M
[06/11 20:48:54 d2.utils.events]:  eta: 0:00:18  iter: 2939  total_loss: 0.243  loss_cls: 0.05776  loss_box_reg: 0.1602  loss_rpn_cls: 0.0002263  loss_rpn_loc: 0.004309    time: 0.3116  last_time: 0.2948  data_time: 0.0077  last_data_time: 0.0046   lr: 0.00025  max_mem: 1741M
[06/11 20:49:00 d2.utils.events]:  eta: 0:00:12  iter: 2959  total_loss: 0.2072  loss_cls: 0.05231  loss_box_reg: 0.1384  loss_rpn_cls: 0.0001004  loss_rpn_loc: 0.003886    time: 0.3116  last_time: 0.3096  data_time: 0.0051  last_data_time: 0.0056   lr: 0.00025  max_mem: 1741M
[06/11 20:49:07 d2.utils.events]:  eta: 0:00:06  iter: 2979  total_loss: 0.24  loss_cls: 0.06452  loss_box_reg: 0.1741  loss_rpn_cls: 6.263e-05  loss_rpn_loc: 0.004615    time: 0.3117  last_time: 0.3103  data_time: 0.0081  last_data_time: 0.0067   lr: 0.00025  max_mem: 1741M
[06/11 20:49:14 d2.utils.events]:  eta: 0:00:00  iter: 2999  total_loss: 0.1809  loss_cls: 0.05744  loss_box_reg: 0.1319  loss_rpn_cls: 9.874e-05  loss_rpn_loc: 0.003562    time: 0.3116  last_time: 0.2604  data_time: 0.0049  last_data_time: 0.0051   lr: 0.00025  max_mem: 1741M
[06/11 20:49:14 d2.engine.hooks]: Overall training speed: 2998 iterations in 0:15:34 (0.3116 s / it)
[06/11 20:49:14 d2.engine.hooks]: Total training time: 0:15:41 (0:00:06 on hooks)
WARNING [06/11 20:49:14 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 20:49:14 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 20:49:14 d2.data.build]: Distribution of instances among all 13 categories:
|   category    | #instances   |  category   | #instances   |   category    | #instances   |
|:-------------:|:-------------|:-----------:|:-------------|:-------------:|:-------------|
| food-detect.. | 0            |  apple_pie  | 15           | chocolate_c.. | 16           |
| french_fries  | 15           |   hot_dog   | 15           |   ice_cream   | 17           |
|    nachos     | 15           | onion_rings | 15           |   pancakes    | 15           |
|     pizza     | 15           |   ravioli   | 15           |    samosa     | 15           |
| spring_rolls  | 15           |             |              |               |              |
|     total     | 183          |             |              |               |              |
[06/11 20:49:14 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 20:49:14 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 20:49:14 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 20:49:14 d2.data.common]: Serialized dataset takes 0.06 MiB
WARNING [06/11 20:49:14 d2.engine.defaults]: No evaluator found. Use `DefaultTrainer.test(evaluators=)`, or implement its `build_evaluator` method.
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(

Model Evaluation

In [ ]:
# Evaluate the model
cfg_fast_rcnn.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Confidence threshold for reported detections
test_loader = build_detection_test_loader(cfg_fast_rcnn, "food_val")
evaluator = COCOEvaluator("food_val", cfg_fast_rcnn, False, output_dir=cfg_fast_rcnn.OUTPUT_DIR)
trainer.test(cfg_fast_rcnn, trainer.model, evaluators=[evaluator])


predictor = DefaultPredictor(cfg_fast_rcnn)
WARNING [06/11 20:52:20 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 20:52:20 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 20:52:20 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 20:52:20 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 20:52:20 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 20:52:20 d2.data.common]: Serialized dataset takes 0.06 MiB
WARNING [06/11 20:52:20 d2.evaluation.coco_evaluation]: COCO Evaluator instantiated using config, this is deprecated behavior. Please pass in explicit arguments instead.
WARNING [06/11 20:52:20 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 20:52:20 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 20:52:20 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 20:52:20 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 20:52:20 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 20:52:20 d2.data.common]: Serialized dataset takes 0.06 MiB
[06/11 20:52:20 d2.evaluation.evaluator]: Start inference on 180 batches
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
[06/11 20:52:23 d2.evaluation.evaluator]: Inference done 11/180. Dataloading: 0.0022 s/iter. Inference: 0.0901 s/iter. Eval: 0.0003 s/iter. Total: 0.0926 s/iter. ETA=0:00:15
[06/11 20:52:28 d2.evaluation.evaluator]: Inference done 70/180. Dataloading: 0.0017 s/iter. Inference: 0.0841 s/iter. Eval: 0.0003 s/iter. Total: 0.0861 s/iter. ETA=0:00:09
[06/11 20:52:33 d2.evaluation.evaluator]: Inference done 128/180. Dataloading: 0.0017 s/iter. Inference: 0.0842 s/iter. Eval: 0.0003 s/iter. Total: 0.0863 s/iter. ETA=0:00:04
[06/11 20:52:38 d2.evaluation.evaluator]: Total inference time: 0:00:15.651795 (0.089439 s / iter per device, on 1 devices)
[06/11 20:52:38 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:14 (0.085554 s / iter per device, on 1 devices)
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Saving results to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_faster_rcnn/coco_instances_results.json
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[06/11 20:52:38 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[06/11 20:52:38 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.03 seconds.
[06/11 20:52:38 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[06/11 20:52:38 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.03 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.381
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.541
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.445
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.381
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.458
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.465
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.465
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.465
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs  |  APm  |  APl   |
|:------:|:------:|:------:|:-----:|:-----:|:------:|
| 38.054 | 54.141 | 44.514 |  nan  |  nan  | 38.054 |
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Some metrics cannot be computed and is shown as NaN.
[06/11 20:52:38 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category       | AP     | category    | AP     | category       | AP     |
|:---------------|:-------|:------------|:-------|:---------------|:-------|
| food-detection | nan    | apple_pie   | 8.097  | chocolate_cake | 45.474 |
| french_fries   | 46.566 | hot_dog     | 38.375 | ice_cream      | 28.436 |
| nachos         | 53.299 | onion_rings | 55.595 | pancakes       | 42.663 |
| pizza          | 73.357 | ravioli     | 28.203 | samosa         | 15.531 |
| spring_rolls   | 21.055 |             |        |                |        |
[06/11 20:52:38 d2.engine.defaults]: Evaluation results for food_val in csv format:
[06/11 20:52:38 d2.evaluation.testing]: copypaste: Task: bbox
[06/11 20:52:38 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/11 20:52:38 d2.evaluation.testing]: copypaste: 38.0543,54.1414,44.5139,nan,nan,38.0543
[06/11 20:52:39 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl ...
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (14, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (14,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (52, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (52,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
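The skipped-parameter warnings above are expected when fine-tuning a COCO-pretrained checkpoint on a dataset with a different class count: Detectron2's class-specific box head derives its output sizes from `NUM_CLASSES`. A minimal sketch of that relationship (the numbers below come directly from the warnings — 80 COCO classes versus our 13 annotation categories):

```python
def box_head_shapes(num_classes):
    """Output sizes of the Faster R-CNN box predictor head.

    cls_score has num_classes + 1 outputs (the +1 is the background class);
    bbox_pred has num_classes * 4 outputs (one box delta set per class).
    """
    return num_classes + 1, num_classes * 4

print(box_head_shapes(80))  # (81, 320) -> the COCO-pretrained checkpoint
print(box_head_shapes(13))  # (14, 52)  -> our 13-category food dataset
```

Because the pretrained `(81, 320)` shapes cannot be loaded into the `(14, 52)` head, those layers are re-initialized and learned from scratch during fine-tuning, which is exactly what we want.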

Model Metrics

In [ ]:
import json
import matplotlib.pyplot as plt

# Path to evaluation results
metrics_path_faster_rcnn = captsone_project_path + "output_food_detection_faster_rcnn/metrics.json"

# Load metrics from the NDJSON evaluation results
def load_metrics_ndjson(metrics_path):
    metrics = []
    with open(metrics_path, "r") as f:
        for line in f:
            metrics.append(json.loads(line))  # Parse each line as a JSON object
    return metrics


# Plot metrics
def plot_metrics_ndjson(metrics):
    # Keep only training-log rows; evaluation-only rows in metrics.json
    # (e.g. the final bbox AP entry) do not carry the loss keys
    train_rows = [m for m in metrics if 'total_loss' in m]

    # Extract iterations, losses and accuracies
    iterations = [m['iteration'] for m in train_rows]
    total_loss = [m['total_loss'] for m in train_rows]
    loss_box_reg = [m['loss_box_reg'] for m in train_rows]
    loss_cls = [m['loss_cls'] for m in train_rows]
    fg_cls_accuracy = [m['fast_rcnn/fg_cls_accuracy'] for m in train_rows]
    cls_accuracy = [m['fast_rcnn/cls_accuracy'] for m in train_rows]
    false_negative = [m['fast_rcnn/false_negative'] for m in train_rows]


    # Plot total loss over iterations
    plt.figure(figsize=(10, 6))
    plt.plot(iterations, total_loss, label="Total Loss", marker="o")
    plt.plot(iterations, loss_box_reg, label="Box Regression Loss", marker="o")
    plt.plot(iterations, loss_cls, label="Classification Loss", marker="o")
    plt.plot(iterations, fg_cls_accuracy, label="Foreground Classification Accuracy", marker="o")
    plt.plot(iterations, cls_accuracy, label="Overall Classification Accuracy", marker="o")
    plt.plot(iterations, false_negative, label="False Negative Rate", marker="o")
    plt.xlabel("Iteration")
    plt.ylabel("Value")  # Mixed units: losses and accuracy/rate metrics
    plt.title("Training Metrics Over Iterations")
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

# Load and showcase metrics
metrics_ndjson = load_metrics_ndjson(metrics_path_faster_rcnn)
plot_metrics_ndjson(metrics_ndjson)
[Figure: training losses, classification accuracies and false-negative rate plotted over iterations]
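The `metrics.json` file parsed above is NDJSON: one JSON object per line, where training rows carry the loss keys and final evaluation rows may not. A small self-contained sketch (the sample lines below are hypothetical, shaped like Detectron2's output) showing why filtering on `total_loss` is needed before plotting:

```python
import json

# Hypothetical NDJSON content in the shape Detectron2 writes to metrics.json
sample_ndjson = """\
{"iteration": 19, "total_loss": 2.31, "loss_cls": 1.10, "loss_box_reg": 0.85}
{"iteration": 39, "total_loss": 1.87, "loss_cls": 0.92, "loss_box_reg": 0.71}
{"bbox/AP": 38.05, "bbox/AP50": 54.14}
"""

# Parse each line as its own JSON object
records = [json.loads(line) for line in sample_ndjson.splitlines()]

# Only rows containing 'total_loss' are training-log entries; filtering
# avoids a KeyError on evaluation-only rows like the last one
train_rows = [r for r in records if "total_loss" in r]
print([r["iteration"] for r in train_rows])  # [19, 39]
```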

Observations

  • Overall classification accuracy remained stable throughout the iterations
  • Total loss decreased steadily over the iterations
  • Classification accuracy improved as the number of iterations increased
In [ ]:
# Run inference and evaluation
results = inference_on_dataset(trainer.model, test_loader, evaluator)
[06/11 20:53:33 d2.evaluation.evaluator]: Start inference on 180 batches
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
[06/11 20:53:34 d2.evaluation.evaluator]: Inference done 11/180. Dataloading: 0.0013 s/iter. Inference: 0.0825 s/iter. Eval: 0.0002 s/iter. Total: 0.0841 s/iter. ETA=0:00:14
[06/11 20:53:39 d2.evaluation.evaluator]: Inference done 66/180. Dataloading: 0.0027 s/iter. Inference: 0.0876 s/iter. Eval: 0.0003 s/iter. Total: 0.0907 s/iter. ETA=0:00:10
[06/11 20:53:44 d2.evaluation.evaluator]: Inference done 124/180. Dataloading: 0.0021 s/iter. Inference: 0.0860 s/iter. Eval: 0.0003 s/iter. Total: 0.0885 s/iter. ETA=0:00:04
[06/11 20:53:49 d2.evaluation.evaluator]: Total inference time: 0:00:15.493481 (0.088534 s / iter per device, on 1 devices)
[06/11 20:53:49 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:15 (0.085790 s / iter per device, on 1 devices)
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Saving results to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_faster_rcnn/coco_instances_results.json
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[06/11 20:53:49 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[06/11 20:53:49 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.03 seconds.
[06/11 20:53:49 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[06/11 20:53:49 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.03 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.381
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.541
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.445
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.381
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.458
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.465
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.465
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.465
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs  |  APm  |  APl   |
|:------:|:------:|:------:|:-----:|:-----:|:------:|
| 38.054 | 54.141 | 44.514 |  nan  |  nan  | 38.054 |
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Some metrics cannot be computed and is shown as NaN.
[06/11 20:53:49 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category       | AP     | category    | AP     | category       | AP     |
|:---------------|:-------|:------------|:-------|:---------------|:-------|
| food-detection | nan    | apple_pie   | 8.097  | chocolate_cake | 45.474 |
| french_fries   | 46.566 | hot_dog     | 38.375 | ice_cream      | 28.436 |
| nachos         | 53.299 | onion_rings | 55.595 | pancakes       | 42.663 |
| pizza          | 73.357 | ravioli     | 28.203 | samosa         | 15.531 |
| spring_rolls   | 21.055 |             |        |                |        |

Model Validation Performance

In [ ]:
print("Evaluation Results:")
for key, value in results.items():
    if isinstance(value, dict):  # Handle nested dictionaries
        print(f"{key}:")
        for sub_key, sub_value in value.items():
            print(f"  {sub_key}: {sub_value:.4f}" if isinstance(sub_value, (int, float)) else f"  {sub_key}: {sub_value}")
    else:
        print(f"{key}: {value:.4f}" if isinstance(value, (int, float)) else f"{key}: {value}")
Evaluation Results:
bbox:
  AP: 38.0543
  AP50: 54.1414
  AP75: 44.5139
  APs: nan
  APm: nan
  APl: 38.0543
  AP-food-detection: nan
  AP-apple_pie: 8.0968
  AP-chocolate_cake: 45.4741
  AP-french_fries: 46.5665
  AP-hot_dog: 38.3747
  AP-ice_cream: 28.4363
  AP-nachos: 53.2992
  AP-onion_rings: 55.5949
  AP-pancakes: 42.6633
  AP-pizza: 73.3566
  AP-ravioli: 28.2030
  AP-samosa: 15.5312
  AP-spring_rolls: 21.0552
In [ ]:
flattened_results = {}
for key, value in results.items():
    if isinstance(value, dict):  # Handle nested dictionaries
        for sub_key, sub_value in value.items():
            flattened_results[f"{key}_{sub_key}"] = sub_value
    else:
        flattened_results[key] = value

# Extract keys and values for plotting
keys = list(flattened_results.keys())
values = [flattened_results[key] for key in keys]

# Create a bar chart
plt.figure(figsize=(10, 6))
bars = plt.bar(keys, values, color="skyblue")

# Add percentage labels on top of the bars
for bar in bars:
    height = bar.get_height()
    plt.text(
        bar.get_x() + bar.get_width() / 2,  # X-coordinate
        height + 0.01,  # Y-coordinate (slightly above the bar)
        f"{height:.2f}%",  # Format as percentage
        ha="center",  # Horizontal alignment
        va="bottom",  # Vertical alignment
        fontsize=10,  # Font size
        color="black"  # Text color
    )
plt.title("COCO Evaluation Metrics")
plt.xlabel("Metric")
plt.ylabel("Value")
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()
[Figure: bar chart of COCO evaluation metrics (overall and per-category bbox AP)]

Observation

  • Overall bounding-box Average Precision (AP) is 38.05%
  • AP is highest for pizza (73.36%), followed by onion_rings (55.59%), nachos (53.30%), french_fries (46.57%) and chocolate_cake (45.47%)
  • The lowest-performing classes are apple_pie (8.10%), samosa (15.53%) and spring_rolls (21.06%)
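The best/worst ranking above can be computed directly rather than read off the chart. A minimal sketch using the per-category AP values reported in the evaluation output:

```python
# Per-category bbox AP values as reported by the COCO evaluation above
per_category_ap = {
    "apple_pie": 8.097, "chocolate_cake": 45.474, "french_fries": 46.566,
    "hot_dog": 38.375, "ice_cream": 28.436, "nachos": 53.299,
    "onion_rings": 55.595, "pancakes": 42.663, "pizza": 73.357,
    "ravioli": 28.203, "samosa": 15.531, "spring_rolls": 21.055,
}

# Sort categories by AP, highest first
ranked = sorted(per_category_ap.items(), key=lambda kv: kv[1], reverse=True)
print("Best :", ranked[:3])   # pizza, onion_rings, nachos
print("Worst:", ranked[-3:])  # spring_rolls, samosa, apple_pie
```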
In [ ]:
# Load Predicted annotated file generated by above predictor
results_path = captsone_project_path + "output_food_detection_faster_rcnn/coco_instances_results.json"

# Load the predictions from the JSON file
with open(results_path, "r") as f:
    predictions = json.load(f)
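The loaded `coco_instances_results.json` is a flat list of detections across all images; before visualizing, it helps to keep only confident ones and group them per image. A self-contained sketch with hypothetical sample predictions in the same COCO-results shape (`image_id`, `category_id`, `bbox` as `[x, y, w, h]`, `score`):

```python
from collections import defaultdict

# Hypothetical predictions shaped like COCOEvaluator's output file
sample_predictions = [
    {"image_id": 1, "category_id": 9, "bbox": [10, 20, 100, 80], "score": 0.91},
    {"image_id": 1, "category_id": 2, "bbox": [5, 5, 40, 40], "score": 0.32},
    {"image_id": 2, "category_id": 6, "bbox": [0, 0, 60, 60], "score": 0.77},
]

def group_confident(preds, score_thresh=0.5):
    """Keep predictions at or above the threshold, grouped by image_id."""
    grouped = defaultdict(list)
    for p in preds:
        if p["score"] >= score_thresh:
            grouped[p["image_id"]].append(p)
    return dict(grouped)

confident = group_confident(sample_predictions)
print({k: len(v) for k, v in confident.items()})  # {1: 1, 2: 1}
```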

Visualising the Model Predictions

In [ ]:
# Create a tabular visualization using the predictions
def visualize_output_in_table(random_images, predictions, metadata):
    num_images = len(random_images)
    # squeeze=False keeps axes two-dimensional even for a single image
    fig, axes = plt.subplots(num_images, 3, figsize=(15, 5 * num_images), squeeze=False)

    for i, d in enumerate(random_images):
        # Load the image
        image_path = d["file_name"]
        image = cv2.imread(image_path)
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Original Image
        axes[i, 0].imshow(image_rgb)
        axes[i, 0].set_title("Original Image")
        axes[i, 0].axis("off")

        # Ground Truth Image
        visualizer_gt = Visualizer(image_rgb, metadata=metadata, scale=0.5)
        vis_gt = visualizer_gt.draw_dataset_dict(d)
        axes[i, 1].imshow(vis_gt.get_image())
        axes[i, 1].set_title("Ground Truth")
        axes[i, 1].axis("off")

        # Predicted Image
        # Filter predictions for the current image
        image_id = d["image_id"]
        image_predictions = [p for p in predictions if p["image_id"] == image_id]

        # Draw predicted boxes on a copy so the "Original Image" panel stays untouched
        image_pred = image_rgb.copy()
        for pred in image_predictions:
            bbox = pred["bbox"]  # COCO format: [x, y, width, height]
            category_id = pred["category_id"]
            class_name = metadata.thing_classes[category_id]  # Get class name from category_id
            score = pred["score"]

            # Draw bounding box
            x, y, w, h = bbox
            cv2.rectangle(image_pred, (int(x), int(y)), (int(x + w), int(y + h)), (255, 0, 0), 2)

            # Draw label inside the bounding box
            label = f"{class_name}: {score:.2f}"
            label_size = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)[0]
            label_x = int(x) + 5  # Add padding inside the box
            label_y = int(y) + label_size[1] + 5  # Position label inside the box
            cv2.rectangle(image_pred, (label_x - 2, label_y - label_size[1] - 2), (label_x + label_size[0] + 2, label_y + 2), (255, 0, 0), -1)  # Background for label
            cv2.putText(image_pred, label, (label_x, label_y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)

        axes[i, 2].imshow(image_pred)
        axes[i, 2].set_title("Predicted Bounding Boxes")
        axes[i, 2].axis("off")

    plt.tight_layout()
    plt.show()

# Visualize the output in tabular format
dataset_dicts = DatasetCatalog.get("food_val")
WARNING [06/11 21:29:58 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 21:29:58 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
In [ ]:
random_images = random.sample(dataset_dicts, 5)  # Select 5 random images
visualize_output_in_table(random_images, predictions, food_metadata)
[Figure: side-by-side panels of original image, ground-truth boxes and predicted bounding boxes for five random test images]

Pickle Model for future predictions

In [ ]:
import pickle
import torch

# Save the trained model weights and configuration
def save_model(trainer, cfg, output_dir):
    # Save model weights
    model_weights_path = os.path.join(output_dir, "model_final.pth")
    torch.save(trainer.model.state_dict(), model_weights_path)
    print(f"Model weights saved to {model_weights_path}")

    # Save configuration
    config_path = os.path.join(output_dir, "config.pkl")
    with open(config_path, "wb") as f:
        pickle.dump(cfg, f)
    print(f"Model configuration saved to {config_path}")

# Load the model weights and configuration for future predictions
def load_model(output_dir):
    # Load configuration
    config_path = os.path.join(output_dir, "config.pkl")
    with open(config_path, "rb") as f:
        cfg = pickle.load(f)
    print(f"Model configuration loaded from {config_path}")

    # Load model weights
    model_weights_path = os.path.join(output_dir, "model_final.pth")
    cfg.MODEL.WEIGHTS = model_weights_path
    print(f"Model weights loaded from {model_weights_path}")

    # Create predictor
    predictor = DefaultPredictor(cfg)
    return predictor, cfg

# Save the model
save_model(trainer, cfg_fast_rcnn, cfg_fast_rcnn.OUTPUT_DIR)
Model weights saved to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_faster_rcnn/model_final.pth
Model configuration saved to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_faster_rcnn/config.pkl
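The save/load pattern above boils down to a pickle round trip of the config plus a weights file on disk. A minimal, Detectron2-free sketch of the mechanics, using a plain dict as a stand-in for the config object (any picklable object behaves the same way):

```python
import os
import pickle
import tempfile

# Stand-in for the Detectron2 config; a real CfgNode pickles the same way
cfg_stub = {"MODEL": {"WEIGHTS": "model_final.pth"}, "SOLVER": {"BASE_LR": 0.00025}}

out_dir = tempfile.mkdtemp()
config_path = os.path.join(out_dir, "config.pkl")

# Save, then reload, mirroring save_model() / load_model() above
with open(config_path, "wb") as f:
    pickle.dump(cfg_stub, f)
with open(config_path, "rb") as f:
    restored = pickle.load(f)

print(restored == cfg_stub)  # True
```

In `load_model()`, the restored config's `MODEL.WEIGHTS` is then pointed at the saved `.pth` file before constructing the `DefaultPredictor`.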

Mask R-CNN Model

In [ ]:
# Configure the Mask R-CNN model for object detection
cfg_mask_rcnn = get_cfg()
cfg_mask_rcnn.merge_from_file(
    model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
)  # Use Mask R-CNN backbone for object detection
cfg_mask_rcnn.DATASETS.TRAIN = ("food_train",)
cfg_mask_rcnn.DATASETS.TEST = ("food_val",)
cfg_mask_rcnn.DATALOADER.NUM_WORKERS = 4
cfg_mask_rcnn.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
)  # Pretrained weights
cfg_mask_rcnn.SOLVER.IMS_PER_BATCH = 2
cfg_mask_rcnn.SOLVER.BASE_LR = 0.00025  # Learning rate
cfg_mask_rcnn.SOLVER.MAX_ITER = 3000  # Adjust based on dataset size
cfg_mask_rcnn.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg_mask_rcnn.MODEL.ROI_HEADS.NUM_CLASSES = len(food_metadata.thing_classes)  # Number of classes
cfg_mask_rcnn.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Threshold for predictions

# Disable mask prediction (only bounding box detection)
cfg_mask_rcnn.MODEL.MASK_ON = False

# Output directory
cfg_mask_rcnn.OUTPUT_DIR = captsone_project_path + "output_food_detection_mask_rcnn"
os.makedirs(cfg_mask_rcnn.OUTPUT_DIR, exist_ok=True)
In [ ]:
# Train the model
trainer = DefaultTrainer(cfg_mask_rcnn)
trainer.resume_or_load(resume=False)
trainer.train()
[06/11 20:54:17 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
      (res2): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv1): Conv2d(
            64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv2): Conv2d(
            64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
          )
          (conv3): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
      )
      (res3): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv1): Conv2d(
            256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv2): Conv2d(
            128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
          )
          (conv3): Conv2d(
            128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
      )
      (res4): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv1): Conv2d(
            512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (4): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (5): BottleneckBlock(
          (conv1): Conv2d(
            1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
      )
      (res5): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv1): Conv2d(
            1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
      )
    )
  )
  (proposal_generator): RPN(
    (rpn_head): StandardRPNHead(
      (conv): Conv2d(
        256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)
        (activation): ReLU()
      )
      (objectness_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))
      (anchor_deltas): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1))
    )
    (anchor_generator): DefaultAnchorGenerator(
      (cell_anchors): BufferList()
    )
  )
  (roi_heads): StandardROIHeads(
    (box_pooler): ROIPooler(
      (level_poolers): ModuleList(
        (0): ROIAlign(output_size=(7, 7), spatial_scale=0.25, sampling_ratio=0, aligned=True)
        (1): ROIAlign(output_size=(7, 7), spatial_scale=0.125, sampling_ratio=0, aligned=True)
        (2): ROIAlign(output_size=(7, 7), spatial_scale=0.0625, sampling_ratio=0, aligned=True)
        (3): ROIAlign(output_size=(7, 7), spatial_scale=0.03125, sampling_ratio=0, aligned=True)
      )
    )
    (box_head): FastRCNNConvFCHead(
      (flatten): Flatten(start_dim=1, end_dim=-1)
      (fc1): Linear(in_features=12544, out_features=1024, bias=True)
      (fc_relu1): ReLU()
      (fc2): Linear(in_features=1024, out_features=1024, bias=True)
      (fc_relu2): ReLU()
    )
    (box_predictor): FastRCNNOutputLayers(
      (cls_score): Linear(in_features=1024, out_features=14, bias=True)
      (bbox_pred): Linear(in_features=1024, out_features=52, bias=True)
    )
  )
)
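The printed box-predictor sizes encode the class count: in Detectron2's `FastRCNNOutputLayers`, `cls_score` has `num_classes + 1` outputs (the extra one is background) and `bbox_pred` has `num_classes * 4` box deltas. The `(14, 52)` pair in the summary above is therefore consistent with `ROI_HEADS.NUM_CLASSES = 13`. A minimal sketch of that arithmetic (the helper name is ours, not a Detectron2 API):

```python
# Sketch: relate Detectron2 box-predictor head sizes to the class count.
# The model summary shows cls_score out_features=14 and bbox_pred
# out_features=52, consistent with 13 foreground classes.

def head_sizes(num_classes: int) -> tuple:
    """Output sizes of a (non-class-agnostic) Fast R-CNN head,
    following Detectron2's convention."""
    cls_out = num_classes + 1   # +1 for the background class
    bbox_out = num_classes * 4  # 4 box-regression deltas per class
    return cls_out, bbox_out

print(head_sizes(13))  # → (14, 52), matching the summary above
print(head_sizes(80))  # → (81, 320), the COCO-pretrained checkpoint's heads
```

The second call previews why the checkpoint warnings further down report shapes `(81, 1024)` and `(320,)`: the pretrained COCO heads were sized for 80 classes.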
WARNING [06/11 20:54:17 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 20:54:17 d2.data.datasets.coco]: Loaded 420 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/train/_annotations.coco.json
[06/11 20:54:17 d2.data.build]: Removed 1 images with no usable annotations. 419 images left.
[06/11 20:54:17 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[06/11 20:54:17 d2.data.build]: Using training sampler TrainingSampler
[06/11 20:54:17 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 20:54:17 d2.data.common]: Serializing 419 elements to byte tensors and concatenating them all ...
[06/11 20:54:17 d2.data.common]: Serialized dataset takes 0.14 MiB
[06/11 20:54:17 d2.data.build]: Making batched data loader with batch_size=2
WARNING [06/11 20:54:17 d2.solver.build]: SOLVER.STEPS contains values larger than SOLVER.MAX_ITER. These values will be ignored.
[06/11 20:54:17 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl ...
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
model_final_f10217.pkl: 178MB [00:03, 50.4MB/s]                           
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (14, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (14,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (52, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (52,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
WARNING:fvcore.common.checkpoint:The checkpoint state_dict contains keys that are not used by the model:
  roi_heads.mask_head.mask_fcn1.{bias, weight}
  roi_heads.mask_head.mask_fcn2.{bias, weight}
  roi_heads.mask_head.mask_fcn3.{bias, weight}
  roi_heads.mask_head.mask_fcn4.{bias, weight}
  roi_heads.mask_head.deconv.{bias, weight}
  roi_heads.mask_head.predictor.{bias, weight}
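These warnings are expected when fine-tuning: the COCO checkpoint's classification and box-regression heads were sized for 80 classes, so their tensors no longer match this 13-class model and are skipped (those layers train from fresh initialization), while the mask-head keys are unused because this configuration trains detection only. A toy sketch of that filtering logic (a hypothetical helper, not the actual fvcore checkpointer):

```python
# Sketch (hypothetical helper, not the fvcore API): mimic how a checkpoint
# loader skips parameters whose shapes no longer match after the number of
# classes changed, so only compatible weights are restored.

def filter_checkpoint(ckpt_shapes: dict, model_shapes: dict):
    loaded, skipped = {}, []
    for name, shape in ckpt_shapes.items():
        if model_shapes.get(name) == shape:
            loaded[name] = shape          # compatible: restore from checkpoint
        else:
            skipped.append(name)          # incompatible shape or unused key
    return loaded, skipped

ckpt = {"roi_heads.box_predictor.cls_score.weight": (81, 1024),
        "backbone.bottom_up.res2.0.conv1.weight": (64, 64, 1, 1)}
model = {"roi_heads.box_predictor.cls_score.weight": (14, 1024),
         "backbone.bottom_up.res2.0.conv1.weight": (64, 64, 1, 1)}
loaded, skipped = filter_checkpoint(ckpt, model)
print(skipped)  # only the class-dependent head is skipped; the backbone loads
```

In other words, the backbone and RPN transfer intact from COCO pretraining; only the task-specific heads start over.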
[06/11 20:54:21 d2.engine.train_loop]: Starting training from iteration 0
[06/11 20:54:28 d2.utils.events]:  eta: 0:16:25  iter: 19  total_loss: 2.985  loss_cls: 2.486  loss_box_reg: 0.4614  loss_rpn_cls: 0.009444  loss_rpn_loc: 0.007784    time: 0.3451  last_time: 0.7432  data_time: 0.0438  last_data_time: 0.3993   lr: 4.9953e-06  max_mem: 2216M
[06/11 20:54:35 d2.utils.events]:  eta: 0:16:45  iter: 39  total_loss: 2.837  loss_cls: 2.364  loss_box_reg: 0.4569  loss_rpn_cls: 0.00724  loss_rpn_loc: 0.00707    time: 0.3505  last_time: 0.2865  data_time: 0.0167  last_data_time: 0.0060   lr: 9.9902e-06  max_mem: 2216M
[06/11 20:54:42 d2.utils.events]:  eta: 0:15:56  iter: 59  total_loss: 2.483  loss_cls: 2.006  loss_box_reg: 0.4873  loss_rpn_cls: 0.0132  loss_rpn_loc: 0.005357    time: 0.3424  last_time: 0.3179  data_time: 0.0069  last_data_time: 0.0051   lr: 1.4985e-05  max_mem: 2216M
[06/11 20:54:49 d2.utils.events]:  eta: 0:16:31  iter: 79  total_loss: 2.054  loss_cls: 1.62  loss_box_reg: 0.4375  loss_rpn_cls: 0.006243  loss_rpn_loc: 0.007431    time: 0.3511  last_time: 0.2774  data_time: 0.0139  last_data_time: 0.0156   lr: 1.998e-05  max_mem: 2216M
[06/11 20:54:56 d2.utils.events]:  eta: 0:16:18  iter: 99  total_loss: 1.571  loss_cls: 1.102  loss_box_reg: 0.4613  loss_rpn_cls: 0.006367  loss_rpn_loc: 0.006682    time: 0.3475  last_time: 0.2820  data_time: 0.0079  last_data_time: 0.0055   lr: 2.4975e-05  max_mem: 2216M
[06/11 20:55:02 d2.utils.events]:  eta: 0:15:49  iter: 119  total_loss: 1.359  loss_cls: 0.7457  loss_box_reg: 0.5642  loss_rpn_cls: 0.006442  loss_rpn_loc: 0.006062    time: 0.3413  last_time: 0.2732  data_time: 0.0066  last_data_time: 0.0049   lr: 2.997e-05  max_mem: 2216M
[06/11 20:55:08 d2.utils.events]:  eta: 0:15:29  iter: 139  total_loss: 0.9789  loss_cls: 0.5645  loss_box_reg: 0.4209  loss_rpn_cls: 0.009716  loss_rpn_loc: 0.006527    time: 0.3364  last_time: 0.3663  data_time: 0.0084  last_data_time: 0.0099   lr: 3.4965e-05  max_mem: 2217M
[06/11 20:55:15 d2.utils.events]:  eta: 0:15:45  iter: 159  total_loss: 0.9979  loss_cls: 0.5118  loss_box_reg: 0.4693  loss_rpn_cls: 0.005858  loss_rpn_loc: 0.007056    time: 0.3395  last_time: 0.2696  data_time: 0.0119  last_data_time: 0.0312   lr: 3.996e-05  max_mem: 2217M
[06/11 20:55:22 d2.utils.events]:  eta: 0:15:26  iter: 179  total_loss: 1.002  loss_cls: 0.5133  loss_box_reg: 0.4758  loss_rpn_cls: 0.001645  loss_rpn_loc: 0.005445    time: 0.3372  last_time: 0.2337  data_time: 0.0098  last_data_time: 0.0056   lr: 4.4955e-05  max_mem: 2217M
[06/11 20:55:28 d2.utils.events]:  eta: 0:15:20  iter: 199  total_loss: 0.9354  loss_cls: 0.4711  loss_box_reg: 0.4376  loss_rpn_cls: 0.003817  loss_rpn_loc: 0.008016    time: 0.3358  last_time: 0.3458  data_time: 0.0056  last_data_time: 0.0052   lr: 4.995e-05  max_mem: 2217M
[06/11 20:55:34 d2.utils.events]:  eta: 0:15:07  iter: 219  total_loss: 0.9419  loss_cls: 0.468  loss_box_reg: 0.4665  loss_rpn_cls: 0.004255  loss_rpn_loc: 0.005762    time: 0.3330  last_time: 0.3054  data_time: 0.0054  last_data_time: 0.0054   lr: 5.4945e-05  max_mem: 2217M
[06/11 20:55:41 d2.utils.events]:  eta: 0:15:13  iter: 239  total_loss: 0.9717  loss_cls: 0.4775  loss_box_reg: 0.4795  loss_rpn_cls: 0.003114  loss_rpn_loc: 0.005759    time: 0.3341  last_time: 0.3688  data_time: 0.0147  last_data_time: 0.0203   lr: 5.994e-05  max_mem: 2217M
[06/11 20:55:48 d2.utils.events]:  eta: 0:15:05  iter: 259  total_loss: 0.9833  loss_cls: 0.4944  loss_box_reg: 0.4722  loss_rpn_cls: 0.004627  loss_rpn_loc: 0.006776    time: 0.3338  last_time: 0.3912  data_time: 0.0124  last_data_time: 0.0190   lr: 6.4935e-05  max_mem: 2217M
[06/11 20:55:56 d2.utils.events]:  eta: 0:15:07  iter: 279  total_loss: 1.037  loss_cls: 0.4994  loss_box_reg: 0.5245  loss_rpn_cls: 0.004767  loss_rpn_loc: 0.006686    time: 0.3378  last_time: 0.3189  data_time: 0.0168  last_data_time: 0.0253   lr: 6.993e-05  max_mem: 2217M
[06/11 20:56:02 d2.utils.events]:  eta: 0:14:56  iter: 299  total_loss: 0.8221  loss_cls: 0.4234  loss_box_reg: 0.4039  loss_rpn_cls: 0.001278  loss_rpn_loc: 0.00657    time: 0.3362  last_time: 0.3447  data_time: 0.0058  last_data_time: 0.0052   lr: 7.4925e-05  max_mem: 2217M
[06/11 20:56:09 d2.utils.events]:  eta: 0:14:55  iter: 319  total_loss: 0.838  loss_cls: 0.4101  loss_box_reg: 0.3985  loss_rpn_cls: 0.003004  loss_rpn_loc: 0.005538    time: 0.3372  last_time: 0.3711  data_time: 0.0165  last_data_time: 0.0242   lr: 7.992e-05  max_mem: 2217M
[06/11 20:56:15 d2.utils.events]:  eta: 0:14:43  iter: 339  total_loss: 0.8658  loss_cls: 0.4267  loss_box_reg: 0.4613  loss_rpn_cls: 0.0007839  loss_rpn_loc: 0.005183    time: 0.3352  last_time: 0.2775  data_time: 0.0055  last_data_time: 0.0049   lr: 8.4915e-05  max_mem: 2217M
[06/11 20:56:22 d2.utils.events]:  eta: 0:14:36  iter: 359  total_loss: 0.7838  loss_cls: 0.39  loss_box_reg: 0.4057  loss_rpn_cls: 0.001025  loss_rpn_loc: 0.00504    time: 0.3345  last_time: 0.3540  data_time: 0.0103  last_data_time: 0.0270   lr: 8.991e-05  max_mem: 2217M
[06/11 20:56:28 d2.utils.events]:  eta: 0:14:25  iter: 379  total_loss: 0.7069  loss_cls: 0.3318  loss_box_reg: 0.3681  loss_rpn_cls: 0.001533  loss_rpn_loc: 0.005993    time: 0.3335  last_time: 0.3470  data_time: 0.0067  last_data_time: 0.0173   lr: 9.4905e-05  max_mem: 2217M
[06/11 20:56:35 d2.utils.events]:  eta: 0:14:23  iter: 399  total_loss: 0.7762  loss_cls: 0.3638  loss_box_reg: 0.4061  loss_rpn_cls: 0.00412  loss_rpn_loc: 0.006304    time: 0.3347  last_time: 0.3359  data_time: 0.0161  last_data_time: 0.0157   lr: 9.99e-05  max_mem: 2217M
[06/11 20:56:41 d2.utils.events]:  eta: 0:14:16  iter: 419  total_loss: 0.8717  loss_cls: 0.3859  loss_box_reg: 0.4533  loss_rpn_cls: 0.0007716  loss_rpn_loc: 0.006375    time: 0.3341  last_time: 0.3423  data_time: 0.0062  last_data_time: 0.0059   lr: 0.0001049  max_mem: 2217M
[06/11 20:56:48 d2.utils.events]:  eta: 0:14:05  iter: 439  total_loss: 0.8214  loss_cls: 0.3955  loss_box_reg: 0.4292  loss_rpn_cls: 0.0003753  loss_rpn_loc: 0.004902    time: 0.3326  last_time: 0.3876  data_time: 0.0098  last_data_time: 0.0285   lr: 0.00010989  max_mem: 2217M
[06/11 20:56:55 d2.utils.events]:  eta: 0:14:00  iter: 459  total_loss: 0.78  loss_cls: 0.3498  loss_box_reg: 0.4073  loss_rpn_cls: 0.000774  loss_rpn_loc: 0.005096    time: 0.3333  last_time: 0.3120  data_time: 0.0143  last_data_time: 0.0049   lr: 0.00011489  max_mem: 2217M
[06/11 20:57:01 d2.utils.events]:  eta: 0:13:56  iter: 479  total_loss: 0.7896  loss_cls: 0.3675  loss_box_reg: 0.399  loss_rpn_cls: 0.001622  loss_rpn_loc: 0.003566    time: 0.3335  last_time: 0.3733  data_time: 0.0098  last_data_time: 0.0104   lr: 0.00011988  max_mem: 2217M
[06/11 20:57:08 d2.utils.events]:  eta: 0:13:50  iter: 499  total_loss: 0.8286  loss_cls: 0.384  loss_box_reg: 0.4231  loss_rpn_cls: 0.002249  loss_rpn_loc: 0.004261    time: 0.3338  last_time: 0.2787  data_time: 0.0131  last_data_time: 0.0067   lr: 0.00012488  max_mem: 2218M
[06/11 20:57:15 d2.utils.events]:  eta: 0:13:40  iter: 519  total_loss: 0.8872  loss_cls: 0.4019  loss_box_reg: 0.466  loss_rpn_cls: 0.001423  loss_rpn_loc: 0.005376    time: 0.3334  last_time: 0.3087  data_time: 0.0080  last_data_time: 0.0055   lr: 0.00012987  max_mem: 2218M
[06/11 20:57:21 d2.utils.events]:  eta: 0:13:32  iter: 539  total_loss: 0.7631  loss_cls: 0.3642  loss_box_reg: 0.3939  loss_rpn_cls: 0.000713  loss_rpn_loc: 0.005735    time: 0.3337  last_time: 0.2724  data_time: 0.0121  last_data_time: 0.0046   lr: 0.00013487  max_mem: 2218M
[06/11 20:57:28 d2.utils.events]:  eta: 0:13:22  iter: 559  total_loss: 0.694  loss_cls: 0.3117  loss_box_reg: 0.3507  loss_rpn_cls: 0.0009181  loss_rpn_loc: 0.006612    time: 0.3326  last_time: 0.3431  data_time: 0.0050  last_data_time: 0.0051   lr: 0.00013986  max_mem: 2218M
[06/11 20:57:34 d2.utils.events]:  eta: 0:13:11  iter: 579  total_loss: 0.7267  loss_cls: 0.3439  loss_box_reg: 0.3906  loss_rpn_cls: 0.001198  loss_rpn_loc: 0.007482    time: 0.3319  last_time: 0.3562  data_time: 0.0053  last_data_time: 0.0050   lr: 0.00014486  max_mem: 2218M
[06/11 20:57:40 d2.utils.events]:  eta: 0:13:06  iter: 599  total_loss: 0.8172  loss_cls: 0.3508  loss_box_reg: 0.4447  loss_rpn_cls: 0.0005779  loss_rpn_loc: 0.006946    time: 0.3319  last_time: 0.2956  data_time: 0.0101  last_data_time: 0.0050   lr: 0.00014985  max_mem: 2218M
[06/11 20:57:48 d2.utils.events]:  eta: 0:13:05  iter: 619  total_loss: 0.8725  loss_cls: 0.3902  loss_box_reg: 0.4771  loss_rpn_cls: 0.0007  loss_rpn_loc: 0.005328    time: 0.3340  last_time: 0.3571  data_time: 0.0181  last_data_time: 0.0102   lr: 0.00015485  max_mem: 2218M
[06/11 20:57:55 d2.utils.events]:  eta: 0:12:54  iter: 639  total_loss: 0.9314  loss_cls: 0.4049  loss_box_reg: 0.507  loss_rpn_cls: 0.000292  loss_rpn_loc: 0.005224    time: 0.3334  last_time: 0.3108  data_time: 0.0073  last_data_time: 0.0050   lr: 0.00015984  max_mem: 2218M
[06/11 20:58:01 d2.utils.events]:  eta: 0:12:47  iter: 659  total_loss: 0.7199  loss_cls: 0.3144  loss_box_reg: 0.4106  loss_rpn_cls: 0.0004811  loss_rpn_loc: 0.00567    time: 0.3333  last_time: 0.3417  data_time: 0.0134  last_data_time: 0.0055   lr: 0.00016484  max_mem: 2218M
[06/11 20:58:08 d2.utils.events]:  eta: 0:12:41  iter: 679  total_loss: 0.7731  loss_cls: 0.3509  loss_box_reg: 0.4526  loss_rpn_cls: 0.0002501  loss_rpn_loc: 0.005878    time: 0.3332  last_time: 0.4429  data_time: 0.0092  last_data_time: 0.0047   lr: 0.00016983  max_mem: 2218M
[06/11 20:58:15 d2.utils.events]:  eta: 0:12:36  iter: 699  total_loss: 0.7569  loss_cls: 0.3467  loss_box_reg: 0.4137  loss_rpn_cls: 0.0005405  loss_rpn_loc: 0.007221    time: 0.3344  last_time: 0.2899  data_time: 0.0160  last_data_time: 0.0054   lr: 0.00017483  max_mem: 2218M
[06/11 20:58:22 d2.utils.events]:  eta: 0:12:30  iter: 719  total_loss: 0.6856  loss_cls: 0.3054  loss_box_reg: 0.3784  loss_rpn_cls: 0.0002139  loss_rpn_loc: 0.006964    time: 0.3342  last_time: 0.3405  data_time: 0.0087  last_data_time: 0.0275   lr: 0.00017982  max_mem: 2218M
[06/11 20:58:29 d2.utils.events]:  eta: 0:12:26  iter: 739  total_loss: 0.7881  loss_cls: 0.3569  loss_box_reg: 0.4203  loss_rpn_cls: 0.0006996  loss_rpn_loc: 0.005493    time: 0.3347  last_time: 0.2954  data_time: 0.0147  last_data_time: 0.0060   lr: 0.00018482  max_mem: 2218M
[06/11 20:58:35 d2.utils.events]:  eta: 0:12:18  iter: 759  total_loss: 0.727  loss_cls: 0.3232  loss_box_reg: 0.4072  loss_rpn_cls: 0.0002494  loss_rpn_loc: 0.005371    time: 0.3343  last_time: 0.2573  data_time: 0.0065  last_data_time: 0.0065   lr: 0.00018981  max_mem: 2218M
[06/11 20:58:42 d2.utils.events]:  eta: 0:12:13  iter: 779  total_loss: 0.8186  loss_cls: 0.3628  loss_box_reg: 0.3819  loss_rpn_cls: 0.0003109  loss_rpn_loc: 0.006567    time: 0.3347  last_time: 0.3026  data_time: 0.0118  last_data_time: 0.0218   lr: 0.00019481  max_mem: 2218M
[06/11 20:58:49 d2.utils.events]:  eta: 0:12:06  iter: 799  total_loss: 0.7915  loss_cls: 0.414  loss_box_reg: 0.4233  loss_rpn_cls: 0.0004654  loss_rpn_loc: 0.006218    time: 0.3342  last_time: 0.3506  data_time: 0.0067  last_data_time: 0.0047   lr: 0.0001998  max_mem: 2218M
[06/11 20:58:55 d2.utils.events]:  eta: 0:12:00  iter: 819  total_loss: 0.6223  loss_cls: 0.2911  loss_box_reg: 0.335  loss_rpn_cls: 0.0002921  loss_rpn_loc: 0.009729    time: 0.3341  last_time: 0.2668  data_time: 0.0067  last_data_time: 0.0044   lr: 0.0002048  max_mem: 2218M
[06/11 20:59:02 d2.utils.events]:  eta: 0:11:53  iter: 839  total_loss: 0.8161  loss_cls: 0.3734  loss_box_reg: 0.3901  loss_rpn_cls: 0.001278  loss_rpn_loc: 0.006423    time: 0.3336  last_time: 0.3128  data_time: 0.0088  last_data_time: 0.0050   lr: 0.00020979  max_mem: 2218M
[06/11 20:59:08 d2.utils.events]:  eta: 0:11:46  iter: 859  total_loss: 0.6722  loss_cls: 0.2925  loss_box_reg: 0.3499  loss_rpn_cls: 9.046e-05  loss_rpn_loc: 0.006181    time: 0.3335  last_time: 0.4798  data_time: 0.0095  last_data_time: 0.0283   lr: 0.00021479  max_mem: 2218M
[06/11 20:59:15 d2.utils.events]:  eta: 0:11:40  iter: 879  total_loss: 0.6922  loss_cls: 0.2954  loss_box_reg: 0.3084  loss_rpn_cls: 7.47e-05  loss_rpn_loc: 0.006868    time: 0.3334  last_time: 0.3343  data_time: 0.0102  last_data_time: 0.0256   lr: 0.00021978  max_mem: 2218M
[06/11 20:59:21 d2.utils.events]:  eta: 0:11:33  iter: 899  total_loss: 0.6393  loss_cls: 0.3254  loss_box_reg: 0.327  loss_rpn_cls: 0.0004109  loss_rpn_loc: 0.00786    time: 0.3333  last_time: 0.2957  data_time: 0.0134  last_data_time: 0.0078   lr: 0.00022478  max_mem: 2218M
[06/11 20:59:27 d2.utils.events]:  eta: 0:11:25  iter: 919  total_loss: 0.6795  loss_cls: 0.3317  loss_box_reg: 0.2841  loss_rpn_cls: 0.0004763  loss_rpn_loc: 0.007004    time: 0.3325  last_time: 0.2376  data_time: 0.0054  last_data_time: 0.0048   lr: 0.00022977  max_mem: 2218M
[06/11 20:59:33 d2.utils.events]:  eta: 0:11:16  iter: 939  total_loss: 0.6832  loss_cls: 0.3342  loss_box_reg: 0.3364  loss_rpn_cls: 0.0007745  loss_rpn_loc: 0.006719    time: 0.3320  last_time: 0.3454  data_time: 0.0064  last_data_time: 0.0054   lr: 0.00023477  max_mem: 2218M
[06/11 20:59:40 d2.utils.events]:  eta: 0:11:09  iter: 959  total_loss: 0.6181  loss_cls: 0.2955  loss_box_reg: 0.2988  loss_rpn_cls: 0.0001147  loss_rpn_loc: 0.007234    time: 0.3317  last_time: 0.2678  data_time: 0.0061  last_data_time: 0.0045   lr: 0.00023976  max_mem: 2218M
[06/11 20:59:46 d2.utils.events]:  eta: 0:11:00  iter: 979  total_loss: 0.5489  loss_cls: 0.2754  loss_box_reg: 0.2245  loss_rpn_cls: 0.0001649  loss_rpn_loc: 0.006005    time: 0.3313  last_time: 0.2956  data_time: 0.0062  last_data_time: 0.0106   lr: 0.00024476  max_mem: 2218M
[06/11 20:59:52 d2.utils.events]:  eta: 0:10:53  iter: 999  total_loss: 0.5991  loss_cls: 0.3115  loss_box_reg: 0.2741  loss_rpn_cls: 0.0002821  loss_rpn_loc: 0.006175    time: 0.3311  last_time: 0.2565  data_time: 0.0095  last_data_time: 0.0046   lr: 0.00024975  max_mem: 2218M
[06/11 20:59:59 d2.utils.events]:  eta: 0:10:46  iter: 1019  total_loss: 0.6334  loss_cls: 0.2925  loss_box_reg: 0.297  loss_rpn_cls: 0.0001673  loss_rpn_loc: 0.005816    time: 0.3312  last_time: 0.3519  data_time: 0.0056  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:00:06 d2.utils.events]:  eta: 0:10:39  iter: 1039  total_loss: 0.549  loss_cls: 0.2704  loss_box_reg: 0.2546  loss_rpn_cls: 0.000197  loss_rpn_loc: 0.005528    time: 0.3309  last_time: 0.2737  data_time: 0.0086  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:00:12 d2.utils.events]:  eta: 0:10:33  iter: 1059  total_loss: 0.5445  loss_cls: 0.2469  loss_box_reg: 0.2932  loss_rpn_cls: 0.0001312  loss_rpn_loc: 0.004521    time: 0.3306  last_time: 0.3002  data_time: 0.0050  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:00:18 d2.utils.events]:  eta: 0:10:25  iter: 1079  total_loss: 0.5003  loss_cls: 0.2023  loss_box_reg: 0.2834  loss_rpn_cls: 0.0001044  loss_rpn_loc: 0.005008    time: 0.3304  last_time: 0.3454  data_time: 0.0090  last_data_time: 0.0051   lr: 0.00025  max_mem: 2218M
[06/11 21:00:25 d2.utils.events]:  eta: 0:10:16  iter: 1099  total_loss: 0.4559  loss_cls: 0.2002  loss_box_reg: 0.2353  loss_rpn_cls: 2.074e-05  loss_rpn_loc: 0.006295    time: 0.3301  last_time: 0.3452  data_time: 0.0052  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:00:31 d2.utils.events]:  eta: 0:10:12  iter: 1119  total_loss: 0.6153  loss_cls: 0.286  loss_box_reg: 0.274  loss_rpn_cls: 9.873e-05  loss_rpn_loc: 0.006061    time: 0.3300  last_time: 0.2967  data_time: 0.0077  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:00:37 d2.utils.events]:  eta: 0:10:05  iter: 1139  total_loss: 0.4861  loss_cls: 0.2341  loss_box_reg: 0.231  loss_rpn_cls: 0.0005688  loss_rpn_loc: 0.005419    time: 0.3296  last_time: 0.3108  data_time: 0.0053  last_data_time: 0.0063   lr: 0.00025  max_mem: 2218M
[06/11 21:00:44 d2.utils.events]:  eta: 0:09:57  iter: 1159  total_loss: 0.4737  loss_cls: 0.2107  loss_box_reg: 0.2304  loss_rpn_cls: 0.0004601  loss_rpn_loc: 0.006469    time: 0.3297  last_time: 0.3081  data_time: 0.0084  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:00:50 d2.utils.events]:  eta: 0:09:49  iter: 1179  total_loss: 0.5481  loss_cls: 0.2358  loss_box_reg: 0.2864  loss_rpn_cls: 3.239e-05  loss_rpn_loc: 0.006619    time: 0.3290  last_time: 0.2664  data_time: 0.0052  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:00:56 d2.utils.events]:  eta: 0:09:39  iter: 1199  total_loss: 0.4314  loss_cls: 0.1728  loss_box_reg: 0.2758  loss_rpn_cls: 0.0003832  loss_rpn_loc: 0.006478    time: 0.3286  last_time: 0.3389  data_time: 0.0105  last_data_time: 0.0293   lr: 0.00025  max_mem: 2218M
[06/11 21:01:02 d2.utils.events]:  eta: 0:09:35  iter: 1219  total_loss: 0.4882  loss_cls: 0.2205  loss_box_reg: 0.2317  loss_rpn_cls: 7.18e-05  loss_rpn_loc: 0.006458    time: 0.3284  last_time: 0.3453  data_time: 0.0054  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:01:08 d2.utils.events]:  eta: 0:09:24  iter: 1239  total_loss: 0.396  loss_cls: 0.2005  loss_box_reg: 0.2079  loss_rpn_cls: 5.427e-05  loss_rpn_loc: 0.004518    time: 0.3281  last_time: 0.2820  data_time: 0.0103  last_data_time: 0.0188   lr: 0.00025  max_mem: 2218M
[06/11 21:01:14 d2.utils.events]:  eta: 0:09:17  iter: 1259  total_loss: 0.4121  loss_cls: 0.1573  loss_box_reg: 0.22  loss_rpn_cls: 5.833e-05  loss_rpn_loc: 0.00476    time: 0.3277  last_time: 0.3093  data_time: 0.0079  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:01:21 d2.utils.events]:  eta: 0:09:09  iter: 1279  total_loss: 0.4953  loss_cls: 0.169  loss_box_reg: 0.2994  loss_rpn_cls: 0.0006316  loss_rpn_loc: 0.007716    time: 0.3280  last_time: 0.4352  data_time: 0.0087  last_data_time: 0.0451   lr: 0.00025  max_mem: 2218M
[06/11 21:01:29 d2.utils.events]:  eta: 0:09:04  iter: 1299  total_loss: 0.4377  loss_cls: 0.1572  loss_box_reg: 0.2617  loss_rpn_cls: 9.029e-05  loss_rpn_loc: 0.006076    time: 0.3285  last_time: 0.2924  data_time: 0.0119  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:01:35 d2.utils.events]:  eta: 0:08:55  iter: 1319  total_loss: 0.4284  loss_cls: 0.17  loss_box_reg: 0.2421  loss_rpn_cls: 0.0001842  loss_rpn_loc: 0.006492    time: 0.3282  last_time: 0.2862  data_time: 0.0057  last_data_time: 0.0186   lr: 0.00025  max_mem: 2218M
[06/11 21:01:41 d2.utils.events]:  eta: 0:08:48  iter: 1339  total_loss: 0.4234  loss_cls: 0.2015  loss_box_reg: 0.2221  loss_rpn_cls: 0.0004486  loss_rpn_loc: 0.004668    time: 0.3279  last_time: 0.3132  data_time: 0.0104  last_data_time: 0.0062   lr: 0.00025  max_mem: 2218M
[06/11 21:01:48 d2.utils.events]:  eta: 0:08:42  iter: 1359  total_loss: 0.3611  loss_cls: 0.175  loss_box_reg: 0.1585  loss_rpn_cls: 9.134e-05  loss_rpn_loc: 0.00541    time: 0.3280  last_time: 0.3175  data_time: 0.0093  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:01:54 d2.utils.events]:  eta: 0:08:36  iter: 1379  total_loss: 0.4003  loss_cls: 0.1725  loss_box_reg: 0.1868  loss_rpn_cls: 0.0001827  loss_rpn_loc: 0.004667    time: 0.3278  last_time: 0.2959  data_time: 0.0087  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:02:01 d2.utils.events]:  eta: 0:08:28  iter: 1399  total_loss: 0.4366  loss_cls: 0.1577  loss_box_reg: 0.2281  loss_rpn_cls: 0.0002788  loss_rpn_loc: 0.004228    time: 0.3280  last_time: 0.3168  data_time: 0.0168  last_data_time: 0.0261   lr: 0.00025  max_mem: 2218M
[06/11 21:02:08 d2.utils.events]:  eta: 0:08:23  iter: 1419  total_loss: 0.3421  loss_cls: 0.1476  loss_box_reg: 0.1583  loss_rpn_cls: 6.685e-05  loss_rpn_loc: 0.004714    time: 0.3286  last_time: 0.3122  data_time: 0.0154  last_data_time: 0.0049   lr: 0.00025  max_mem: 2218M
[06/11 21:02:14 d2.utils.events]:  eta: 0:08:16  iter: 1439  total_loss: 0.3997  loss_cls: 0.1892  loss_box_reg: 0.2217  loss_rpn_cls: 7.186e-05  loss_rpn_loc: 0.006124    time: 0.3284  last_time: 0.2963  data_time: 0.0049  last_data_time: 0.0049   lr: 0.00025  max_mem: 2218M
[06/11 21:02:21 d2.utils.events]:  eta: 0:08:09  iter: 1459  total_loss: 0.4564  loss_cls: 0.1968  loss_box_reg: 0.2649  loss_rpn_cls: 0.0002299  loss_rpn_loc: 0.0066    time: 0.3282  last_time: 0.3447  data_time: 0.0073  last_data_time: 0.0064   lr: 0.00025  max_mem: 2218M
[06/11 21:02:27 d2.utils.events]:  eta: 0:08:02  iter: 1479  total_loss: 0.4186  loss_cls: 0.159  loss_box_reg: 0.2318  loss_rpn_cls: 6.223e-05  loss_rpn_loc: 0.003998    time: 0.3280  last_time: 0.3427  data_time: 0.0059  last_data_time: 0.0066   lr: 0.00025  max_mem: 2218M
[06/11 21:02:34 d2.utils.events]:  eta: 0:07:55  iter: 1499  total_loss: 0.3373  loss_cls: 0.1544  loss_box_reg: 0.1728  loss_rpn_cls: 0.0001333  loss_rpn_loc: 0.004816    time: 0.3279  last_time: 0.3069  data_time: 0.0061  last_data_time: 0.0043   lr: 0.00025  max_mem: 2218M
[06/11 21:02:40 d2.utils.events]:  eta: 0:07:48  iter: 1519  total_loss: 0.3352  loss_cls: 0.1322  loss_box_reg: 0.1965  loss_rpn_cls: 8.195e-05  loss_rpn_loc: 0.004037    time: 0.3277  last_time: 0.2936  data_time: 0.0069  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:02:46 d2.utils.events]:  eta: 0:07:41  iter: 1539  total_loss: 0.4117  loss_cls: 0.153  loss_box_reg: 0.2325  loss_rpn_cls: 0.0001494  loss_rpn_loc: 0.005782    time: 0.3276  last_time: 0.3607  data_time: 0.0090  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:02:52 d2.utils.events]:  eta: 0:07:35  iter: 1559  total_loss: 0.3234  loss_cls: 0.151  loss_box_reg: 0.1514  loss_rpn_cls: 0.0004009  loss_rpn_loc: 0.005099    time: 0.3274  last_time: 0.3129  data_time: 0.0052  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:02:59 d2.utils.events]:  eta: 0:07:30  iter: 1579  total_loss: 0.3495  loss_cls: 0.1342  loss_box_reg: 0.2021  loss_rpn_cls: 0.0001153  loss_rpn_loc: 0.005265    time: 0.3274  last_time: 0.3121  data_time: 0.0091  last_data_time: 0.0062   lr: 0.00025  max_mem: 2218M
[06/11 21:03:05 d2.utils.events]:  eta: 0:07:21  iter: 1599  total_loss: 0.3234  loss_cls: 0.1334  loss_box_reg: 0.1863  loss_rpn_cls: 9.83e-05  loss_rpn_loc: 0.00366    time: 0.3270  last_time: 0.2665  data_time: 0.0049  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:03:11 d2.utils.events]:  eta: 0:07:13  iter: 1619  total_loss: 0.363  loss_cls: 0.1332  loss_box_reg: 0.2275  loss_rpn_cls: 0.0002653  loss_rpn_loc: 0.005119    time: 0.3268  last_time: 0.3648  data_time: 0.0070  last_data_time: 0.0175   lr: 0.00025  max_mem: 2218M
[06/11 21:03:17 d2.utils.events]:  eta: 0:07:07  iter: 1639  total_loss: 0.3215  loss_cls: 0.1317  loss_box_reg: 0.167  loss_rpn_cls: 9.09e-05  loss_rpn_loc: 0.005041    time: 0.3266  last_time: 0.2720  data_time: 0.0065  last_data_time: 0.0050   lr: 0.00025  max_mem: 2218M
[06/11 21:03:24 d2.utils.events]:  eta: 0:07:01  iter: 1659  total_loss: 0.4282  loss_cls: 0.1621  loss_box_reg: 0.2203  loss_rpn_cls: 0.0001808  loss_rpn_loc: 0.004345    time: 0.3267  last_time: 0.4113  data_time: 0.0116  last_data_time: 0.0263   lr: 0.00025  max_mem: 2218M
[06/11 21:03:30 d2.utils.events]:  eta: 0:06:54  iter: 1679  total_loss: 0.3419  loss_cls: 0.1329  loss_box_reg: 0.212  loss_rpn_cls: 8.693e-05  loss_rpn_loc: 0.004395    time: 0.3266  last_time: 0.3012  data_time: 0.0071  last_data_time: 0.0051   lr: 0.00025  max_mem: 2218M
[06/11 21:03:37 d2.utils.events]:  eta: 0:06:48  iter: 1699  total_loss: 0.3189  loss_cls: 0.1129  loss_box_reg: 0.1885  loss_rpn_cls: 0.0001462  loss_rpn_loc: 0.003627    time: 0.3268  last_time: 0.3277  data_time: 0.0124  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:03:45 d2.utils.events]:  eta: 0:06:43  iter: 1719  total_loss: 0.3225  loss_cls: 0.1019  loss_box_reg: 0.205  loss_rpn_cls: 0.0006613  loss_rpn_loc: 0.006122    time: 0.3275  last_time: 0.2636  data_time: 0.0170  last_data_time: 0.0077   lr: 0.00025  max_mem: 2218M
[06/11 21:03:53 d2.utils.events]:  eta: 0:06:35  iter: 1739  total_loss: 0.342  loss_cls: 0.1151  loss_box_reg: 0.1866  loss_rpn_cls: 0.0001018  loss_rpn_loc: 0.007179    time: 0.3283  last_time: 0.4228  data_time: 0.0158  last_data_time: 0.0356   lr: 0.00025  max_mem: 2218M
[06/11 21:04:00 d2.utils.events]:  eta: 0:06:30  iter: 1759  total_loss: 0.3067  loss_cls: 0.1151  loss_box_reg: 0.1646  loss_rpn_cls: 3.7e-05  loss_rpn_loc: 0.005391    time: 0.3286  last_time: 0.2731  data_time: 0.0151  last_data_time: 0.0053   lr: 0.00025  max_mem: 2218M
[06/11 21:04:06 d2.utils.events]:  eta: 0:06:23  iter: 1779  total_loss: 0.2909  loss_cls: 0.1061  loss_box_reg: 0.1731  loss_rpn_cls: 2.149e-05  loss_rpn_loc: 0.004299    time: 0.3284  last_time: 0.3073  data_time: 0.0077  last_data_time: 0.0042   lr: 0.00025  max_mem: 2218M
[06/11 21:04:12 d2.utils.events]:  eta: 0:06:16  iter: 1799  total_loss: 0.2773  loss_cls: 0.1211  loss_box_reg: 0.1784  loss_rpn_cls: 3.174e-05  loss_rpn_loc: 0.003577    time: 0.3281  last_time: 0.3115  data_time: 0.0077  last_data_time: 0.0056   lr: 0.00025  max_mem: 2218M
[06/11 21:04:18 d2.utils.events]:  eta: 0:06:09  iter: 1819  total_loss: 0.3337  loss_cls: 0.1178  loss_box_reg: 0.1945  loss_rpn_cls: 0.0002643  loss_rpn_loc: 0.004821    time: 0.3278  last_time: 0.3532  data_time: 0.0050  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:04:25 d2.utils.events]:  eta: 0:06:02  iter: 1839  total_loss: 0.2761  loss_cls: 0.1122  loss_box_reg: 0.1429  loss_rpn_cls: 8.814e-05  loss_rpn_loc: 0.00409    time: 0.3278  last_time: 0.3075  data_time: 0.0071  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:04:31 d2.utils.events]:  eta: 0:05:56  iter: 1859  total_loss: 0.321  loss_cls: 0.1166  loss_box_reg: 0.2026  loss_rpn_cls: 0.0001775  loss_rpn_loc: 0.005085    time: 0.3276  last_time: 0.3439  data_time: 0.0049  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:04:37 d2.utils.events]:  eta: 0:05:50  iter: 1879  total_loss: 0.363  loss_cls: 0.1053  loss_box_reg: 0.2117  loss_rpn_cls: 5.974e-05  loss_rpn_loc: 0.005519    time: 0.3275  last_time: 0.2969  data_time: 0.0071  last_data_time: 0.0053   lr: 0.00025  max_mem: 2218M
[06/11 21:04:44 d2.utils.events]:  eta: 0:05:43  iter: 1899  total_loss: 0.338  loss_cls: 0.1135  loss_box_reg: 0.2143  loss_rpn_cls: 0.00014  loss_rpn_loc: 0.005847    time: 0.3272  last_time: 0.2916  data_time: 0.0052  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:04:50 d2.utils.events]:  eta: 0:05:37  iter: 1919  total_loss: 0.3323  loss_cls: 0.09805  loss_box_reg: 0.1975  loss_rpn_cls: 0.0001696  loss_rpn_loc: 0.00432    time: 0.3271  last_time: 0.3409  data_time: 0.0076  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:04:56 d2.utils.events]:  eta: 0:05:31  iter: 1939  total_loss: 0.2782  loss_cls: 0.08081  loss_box_reg: 0.1691  loss_rpn_cls: 0.0001384  loss_rpn_loc: 0.004571    time: 0.3270  last_time: 0.3382  data_time: 0.0066  last_data_time: 0.0049   lr: 0.00025  max_mem: 2218M
[06/11 21:05:03 d2.utils.events]:  eta: 0:05:25  iter: 1959  total_loss: 0.2536  loss_cls: 0.0865  loss_box_reg: 0.1387  loss_rpn_cls: 5.526e-05  loss_rpn_loc: 0.00335    time: 0.3269  last_time: 0.3168  data_time: 0.0104  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:05:09 d2.utils.events]:  eta: 0:05:18  iter: 1979  total_loss: 0.3203  loss_cls: 0.1224  loss_box_reg: 0.1752  loss_rpn_cls: 4.872e-05  loss_rpn_loc: 0.005367    time: 0.3267  last_time: 0.3483  data_time: 0.0064  last_data_time: 0.0054   lr: 0.00025  max_mem: 2218M
[06/11 21:05:15 d2.utils.events]:  eta: 0:05:12  iter: 1999  total_loss: 0.2853  loss_cls: 0.1045  loss_box_reg: 0.1709  loss_rpn_cls: 0.0001333  loss_rpn_loc: 0.003922    time: 0.3266  last_time: 0.3462  data_time: 0.0091  last_data_time: 0.0057   lr: 0.00025  max_mem: 2218M
[06/11 21:05:21 d2.utils.events]:  eta: 0:05:06  iter: 2019  total_loss: 0.3125  loss_cls: 0.07775  loss_box_reg: 0.1947  loss_rpn_cls: 0.000114  loss_rpn_loc: 0.005451    time: 0.3265  last_time: 0.3427  data_time: 0.0056  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:05:27 d2.utils.events]:  eta: 0:04:59  iter: 2039  total_loss: 0.2552  loss_cls: 0.0719  loss_box_reg: 0.2052  loss_rpn_cls: 4.882e-05  loss_rpn_loc: 0.004462    time: 0.3263  last_time: 0.3200  data_time: 0.0081  last_data_time: 0.0090   lr: 0.00025  max_mem: 2218M
[06/11 21:05:34 d2.utils.events]:  eta: 0:04:53  iter: 2059  total_loss: 0.333  loss_cls: 0.1132  loss_box_reg: 0.2057  loss_rpn_cls: 0.0001863  loss_rpn_loc: 0.00425    time: 0.3264  last_time: 0.3552  data_time: 0.0109  last_data_time: 0.0172   lr: 0.00025  max_mem: 2218M
[06/11 21:05:41 d2.utils.events]:  eta: 0:04:47  iter: 2079  total_loss: 0.2965  loss_cls: 0.1062  loss_box_reg: 0.1766  loss_rpn_cls: 9.444e-05  loss_rpn_loc: 0.00451    time: 0.3264  last_time: 0.3705  data_time: 0.0098  last_data_time: 0.0154   lr: 0.00025  max_mem: 2218M
[06/11 21:05:47 d2.utils.events]:  eta: 0:04:41  iter: 2099  total_loss: 0.2316  loss_cls: 0.09937  loss_box_reg: 0.1385  loss_rpn_cls: 0.0003602  loss_rpn_loc: 0.005405    time: 0.3263  last_time: 0.2337  data_time: 0.0081  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:05:53 d2.utils.events]:  eta: 0:04:34  iter: 2119  total_loss: 0.2444  loss_cls: 0.08329  loss_box_reg: 0.1779  loss_rpn_cls: 0.0001832  loss_rpn_loc: 0.003838    time: 0.3261  last_time: 0.3597  data_time: 0.0054  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:05:59 d2.utils.events]:  eta: 0:04:28  iter: 2139  total_loss: 0.2483  loss_cls: 0.08675  loss_box_reg: 0.1461  loss_rpn_cls: 7.929e-05  loss_rpn_loc: 0.003916    time: 0.3260  last_time: 0.2572  data_time: 0.0098  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:06:06 d2.utils.events]:  eta: 0:04:22  iter: 2159  total_loss: 0.2587  loss_cls: 0.09751  loss_box_reg: 0.1534  loss_rpn_cls: 0.0002988  loss_rpn_loc: 0.004308    time: 0.3258  last_time: 0.3063  data_time: 0.0072  last_data_time: 0.0046   lr: 0.00025  max_mem: 2218M
[06/11 21:06:12 d2.utils.events]:  eta: 0:04:16  iter: 2179  total_loss: 0.2298  loss_cls: 0.06958  loss_box_reg: 0.1366  loss_rpn_cls: 3.212e-05  loss_rpn_loc: 0.004234    time: 0.3258  last_time: 0.2995  data_time: 0.0079  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:06:18 d2.utils.events]:  eta: 0:04:10  iter: 2199  total_loss: 0.2822  loss_cls: 0.09901  loss_box_reg: 0.1987  loss_rpn_cls: 0.0001488  loss_rpn_loc: 0.004154    time: 0.3257  last_time: 0.2926  data_time: 0.0051  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:06:25 d2.utils.events]:  eta: 0:04:04  iter: 2219  total_loss: 0.2497  loss_cls: 0.0761  loss_box_reg: 0.1539  loss_rpn_cls: 7.595e-05  loss_rpn_loc: 0.004584    time: 0.3257  last_time: 0.3415  data_time: 0.0123  last_data_time: 0.0064   lr: 0.00025  max_mem: 2218M
[06/11 21:06:31 d2.utils.events]:  eta: 0:03:57  iter: 2239  total_loss: 0.2767  loss_cls: 0.1011  loss_box_reg: 0.1422  loss_rpn_cls: 9.246e-05  loss_rpn_loc: 0.004381    time: 0.3255  last_time: 0.3047  data_time: 0.0051  last_data_time: 0.0045   lr: 0.00025  max_mem: 2218M
[06/11 21:06:37 d2.utils.events]:  eta: 0:03:51  iter: 2259  total_loss: 0.2835  loss_cls: 0.07481  loss_box_reg: 0.1713  loss_rpn_cls: 6.374e-05  loss_rpn_loc: 0.004059    time: 0.3254  last_time: 0.2679  data_time: 0.0075  last_data_time: 0.0051   lr: 0.00025  max_mem: 2218M
[06/11 21:06:44 d2.utils.events]:  eta: 0:03:45  iter: 2279  total_loss: 0.2596  loss_cls: 0.07842  loss_box_reg: 0.1576  loss_rpn_cls: 0.0001082  loss_rpn_loc: 0.005088    time: 0.3253  last_time: 0.3112  data_time: 0.0054  last_data_time: 0.0057   lr: 0.00025  max_mem: 2218M
[06/11 21:06:50 d2.utils.events]:  eta: 0:03:38  iter: 2299  total_loss: 0.2714  loss_cls: 0.07814  loss_box_reg: 0.1625  loss_rpn_cls: 0.0002312  loss_rpn_loc: 0.005346    time: 0.3253  last_time: 0.3140  data_time: 0.0053  last_data_time: 0.0109   lr: 0.00025  max_mem: 2218M
[06/11 21:06:56 d2.utils.events]:  eta: 0:03:32  iter: 2319  total_loss: 0.2722  loss_cls: 0.09712  loss_box_reg: 0.1904  loss_rpn_cls: 0.0001505  loss_rpn_loc: 0.005902    time: 0.3252  last_time: 0.3116  data_time: 0.0050  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:07:03 d2.utils.events]:  eta: 0:03:26  iter: 2339  total_loss: 0.2246  loss_cls: 0.06968  loss_box_reg: 0.1501  loss_rpn_cls: 0.0002359  loss_rpn_loc: 0.004813    time: 0.3254  last_time: 0.5778  data_time: 0.0068  last_data_time: 0.0233   lr: 0.00025  max_mem: 2218M
[06/11 21:07:10 d2.utils.events]:  eta: 0:03:20  iter: 2359  total_loss: 0.252  loss_cls: 0.07903  loss_box_reg: 0.1518  loss_rpn_cls: 0.0001983  loss_rpn_loc: 0.003778    time: 0.3255  last_time: 0.2615  data_time: 0.0118  last_data_time: 0.0064   lr: 0.00025  max_mem: 2218M
[06/11 21:07:16 d2.utils.events]:  eta: 0:03:13  iter: 2379  total_loss: 0.2579  loss_cls: 0.08254  loss_box_reg: 0.1649  loss_rpn_cls: 5.483e-05  loss_rpn_loc: 0.003424    time: 0.3254  last_time: 0.3742  data_time: 0.0057  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:07:22 d2.utils.events]:  eta: 0:03:07  iter: 2399  total_loss: 0.2431  loss_cls: 0.07599  loss_box_reg: 0.1325  loss_rpn_cls: 0.0001741  loss_rpn_loc: 0.004189    time: 0.3251  last_time: 0.2711  data_time: 0.0065  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:07:29 d2.utils.events]:  eta: 0:03:00  iter: 2419  total_loss: 0.2043  loss_cls: 0.0611  loss_box_reg: 0.1453  loss_rpn_cls: 0.0001717  loss_rpn_loc: 0.005442    time: 0.3250  last_time: 0.3395  data_time: 0.0064  last_data_time: 0.0243   lr: 0.00025  max_mem: 2218M
[06/11 21:07:35 d2.utils.events]:  eta: 0:02:54  iter: 2439  total_loss: 0.2081  loss_cls: 0.07673  loss_box_reg: 0.1429  loss_rpn_cls: 0.0001283  loss_rpn_loc: 0.00372    time: 0.3250  last_time: 0.2966  data_time: 0.0061  last_data_time: 0.0062   lr: 0.00025  max_mem: 2218M
[06/11 21:07:41 d2.utils.events]:  eta: 0:02:48  iter: 2459  total_loss: 0.2359  loss_cls: 0.06515  loss_box_reg: 0.1533  loss_rpn_cls: 8.307e-05  loss_rpn_loc: 0.004198    time: 0.3249  last_time: 0.3259  data_time: 0.0055  last_data_time: 0.0109   lr: 0.00025  max_mem: 2218M
[06/11 21:07:48 d2.utils.events]:  eta: 0:02:42  iter: 2479  total_loss: 0.305  loss_cls: 0.07766  loss_box_reg: 0.2111  loss_rpn_cls: 4.833e-05  loss_rpn_loc: 0.00388    time: 0.3248  last_time: 0.3478  data_time: 0.0105  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:07:53 d2.utils.events]:  eta: 0:02:35  iter: 2499  total_loss: 0.2738  loss_cls: 0.08432  loss_box_reg: 0.1767  loss_rpn_cls: 0.0002391  loss_rpn_loc: 0.004558    time: 0.3245  last_time: 0.2761  data_time: 0.0049  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:08:00 d2.utils.events]:  eta: 0:02:29  iter: 2519  total_loss: 0.257  loss_cls: 0.09122  loss_box_reg: 0.1566  loss_rpn_cls: 3.695e-05  loss_rpn_loc: 0.003122    time: 0.3245  last_time: 0.3098  data_time: 0.0089  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:08:06 d2.utils.events]:  eta: 0:02:23  iter: 2539  total_loss: 0.2437  loss_cls: 0.06893  loss_box_reg: 0.161  loss_rpn_cls: 6.195e-05  loss_rpn_loc: 0.005071    time: 0.3244  last_time: 0.3502  data_time: 0.0052  last_data_time: 0.0049   lr: 0.00025  max_mem: 2218M
[06/11 21:08:12 d2.utils.events]:  eta: 0:02:17  iter: 2559  total_loss: 0.2408  loss_cls: 0.08244  loss_box_reg: 0.158  loss_rpn_cls: 4.118e-05  loss_rpn_loc: 0.00372    time: 0.3243  last_time: 0.2949  data_time: 0.0093  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:08:19 d2.utils.events]:  eta: 0:02:10  iter: 2579  total_loss: 0.218  loss_cls: 0.06938  loss_box_reg: 0.1289  loss_rpn_cls: 6.857e-05  loss_rpn_loc: 0.004502    time: 0.3242  last_time: 0.3420  data_time: 0.0053  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:08:25 d2.utils.events]:  eta: 0:02:04  iter: 2599  total_loss: 0.1719  loss_cls: 0.05316  loss_box_reg: 0.1184  loss_rpn_cls: 4.113e-05  loss_rpn_loc: 0.003906    time: 0.3243  last_time: 0.2793  data_time: 0.0061  last_data_time: 0.0051   lr: 0.00025  max_mem: 2218M
[06/11 21:08:31 d2.utils.events]:  eta: 0:01:58  iter: 2619  total_loss: 0.2393  loss_cls: 0.07731  loss_box_reg: 0.1476  loss_rpn_cls: 3.721e-05  loss_rpn_loc: 0.004135    time: 0.3242  last_time: 0.2768  data_time: 0.0055  last_data_time: 0.0055   lr: 0.00025  max_mem: 2218M
[06/11 21:08:38 d2.utils.events]:  eta: 0:01:52  iter: 2639  total_loss: 0.2385  loss_cls: 0.06734  loss_box_reg: 0.1664  loss_rpn_cls: 0.0001021  loss_rpn_loc: 0.004176    time: 0.3241  last_time: 0.3583  data_time: 0.0063  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:08:44 d2.utils.events]:  eta: 0:01:45  iter: 2659  total_loss: 0.2202  loss_cls: 0.06989  loss_box_reg: 0.1527  loss_rpn_cls: 0.0003809  loss_rpn_loc: 0.005514    time: 0.3239  last_time: 0.2708  data_time: 0.0051  last_data_time: 0.0048   lr: 0.00025  max_mem: 2218M
[06/11 21:08:50 d2.utils.events]:  eta: 0:01:39  iter: 2679  total_loss: 0.2392  loss_cls: 0.08593  loss_box_reg: 0.1633  loss_rpn_cls: 0.0002082  loss_rpn_loc: 0.003767    time: 0.3237  last_time: 0.3571  data_time: 0.0049  last_data_time: 0.0054   lr: 0.00025  max_mem: 2218M
[06/11 21:08:56 d2.utils.events]:  eta: 0:01:33  iter: 2699  total_loss: 0.2415  loss_cls: 0.07492  loss_box_reg: 0.1784  loss_rpn_cls: 8.463e-05  loss_rpn_loc: 0.00421    time: 0.3237  last_time: 0.3421  data_time: 0.0070  last_data_time: 0.0053   lr: 0.00025  max_mem: 2218M
[06/11 21:09:02 d2.utils.events]:  eta: 0:01:26  iter: 2719  total_loss: 0.2451  loss_cls: 0.06379  loss_box_reg: 0.1743  loss_rpn_cls: 0.000105  loss_rpn_loc: 0.003866    time: 0.3235  last_time: 0.3113  data_time: 0.0056  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:09:09 d2.utils.events]:  eta: 0:01:20  iter: 2739  total_loss: 0.2603  loss_cls: 0.07099  loss_box_reg: 0.1787  loss_rpn_cls: 2.003e-05  loss_rpn_loc: 0.00361    time: 0.3235  last_time: 0.3096  data_time: 0.0131  last_data_time: 0.0061   lr: 0.00025  max_mem: 2218M
[06/11 21:09:15 d2.utils.events]:  eta: 0:01:14  iter: 2759  total_loss: 0.1849  loss_cls: 0.05711  loss_box_reg: 0.1181  loss_rpn_cls: 7.596e-05  loss_rpn_loc: 0.00374    time: 0.3233  last_time: 0.2934  data_time: 0.0052  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:09:21 d2.utils.events]:  eta: 0:01:08  iter: 2779  total_loss: 0.2484  loss_cls: 0.06847  loss_box_reg: 0.1598  loss_rpn_cls: 9.951e-05  loss_rpn_loc: 0.003478    time: 0.3233  last_time: 0.2545  data_time: 0.0076  last_data_time: 0.0051   lr: 0.00025  max_mem: 2218M
[06/11 21:09:27 d2.utils.events]:  eta: 0:01:02  iter: 2799  total_loss: 0.2122  loss_cls: 0.06452  loss_box_reg: 0.131  loss_rpn_cls: 8.2e-05  loss_rpn_loc: 0.004906    time: 0.3231  last_time: 0.2425  data_time: 0.0051  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:09:33 d2.utils.events]:  eta: 0:00:55  iter: 2819  total_loss: 0.1882  loss_cls: 0.05708  loss_box_reg: 0.1152  loss_rpn_cls: 0.0003531  loss_rpn_loc: 0.005777    time: 0.3231  last_time: 0.3437  data_time: 0.0062  last_data_time: 0.0044   lr: 0.00025  max_mem: 2218M
[06/11 21:09:39 d2.utils.events]:  eta: 0:00:49  iter: 2839  total_loss: 0.1714  loss_cls: 0.04956  loss_box_reg: 0.1133  loss_rpn_cls: 0.0001127  loss_rpn_loc: 0.004324    time: 0.3229  last_time: 0.2722  data_time: 0.0055  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:09:46 d2.utils.events]:  eta: 0:00:43  iter: 2859  total_loss: 0.2628  loss_cls: 0.08182  loss_box_reg: 0.1709  loss_rpn_cls: 0.0001109  loss_rpn_loc: 0.003749    time: 0.3230  last_time: 0.3450  data_time: 0.0067  last_data_time: 0.0047   lr: 0.00025  max_mem: 2218M
[06/11 21:09:52 d2.utils.events]:  eta: 0:00:37  iter: 2879  total_loss: 0.2083  loss_cls: 0.05406  loss_box_reg: 0.1279  loss_rpn_cls: 0.0002865  loss_rpn_loc: 0.00439    time: 0.3229  last_time: 0.2565  data_time: 0.0049  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:09:58 d2.utils.events]:  eta: 0:00:31  iter: 2899  total_loss: 0.2174  loss_cls: 0.06477  loss_box_reg: 0.1444  loss_rpn_cls: 6.589e-05  loss_rpn_loc: 0.004472    time: 0.3228  last_time: 0.3642  data_time: 0.0106  last_data_time: 0.0223   lr: 0.00025  max_mem: 2218M
[06/11 21:10:05 d2.utils.events]:  eta: 0:00:24  iter: 2919  total_loss: 0.2481  loss_cls: 0.06917  loss_box_reg: 0.1704  loss_rpn_cls: 4.68e-05  loss_rpn_loc: 0.00366    time: 0.3228  last_time: 0.3088  data_time: 0.0061  last_data_time: 0.0054   lr: 0.00025  max_mem: 2218M
[06/11 21:10:11 d2.utils.events]:  eta: 0:00:18  iter: 2939  total_loss: 0.2363  loss_cls: 0.06216  loss_box_reg: 0.1743  loss_rpn_cls: 0.0001066  loss_rpn_loc: 0.004005    time: 0.3226  last_time: 0.3270  data_time: 0.0059  last_data_time: 0.0052   lr: 0.00025  max_mem: 2218M
[06/11 21:10:17 d2.utils.events]:  eta: 0:00:12  iter: 2959  total_loss: 0.219  loss_cls: 0.04887  loss_box_reg: 0.1511  loss_rpn_cls: 0.0002576  loss_rpn_loc: 0.004627    time: 0.3226  last_time: 0.3103  data_time: 0.0071  last_data_time: 0.0057   lr: 0.00025  max_mem: 2218M
[06/11 21:10:23 d2.utils.events]:  eta: 0:00:06  iter: 2979  total_loss: 0.2623  loss_cls: 0.05768  loss_box_reg: 0.1946  loss_rpn_cls: 0.0001261  loss_rpn_loc: 0.004949    time: 0.3225  last_time: 0.3102  data_time: 0.0066  last_data_time: 0.0236   lr: 0.00025  max_mem: 2218M
[06/11 21:10:31 d2.utils.events]:  eta: 0:00:00  iter: 2999  total_loss: 0.2202  loss_cls: 0.04791  loss_box_reg: 0.1607  loss_rpn_cls: 6.009e-05  loss_rpn_loc: 0.003821    time: 0.3225  last_time: 0.2949  data_time: 0.0077  last_data_time: 0.0049   lr: 0.00025  max_mem: 2218M
[06/11 21:10:31 d2.engine.hooks]: Overall training speed: 2998 iterations in 0:16:06 (0.3225 s / it)
[06/11 21:10:31 d2.engine.hooks]: Total training time: 0:16:09 (0:00:02 on hooks)
WARNING [06/11 21:10:31 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 21:10:31 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 21:10:31 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 21:10:31 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 21:10:31 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 21:10:31 d2.data.common]: Serialized dataset takes 0.06 MiB
WARNING [06/11 21:10:31 d2.engine.defaults]: No evaluator found. Use `DefaultTrainer.test(evaluators=)`, or implement its `build_evaluator` method.
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
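The `DataLoader` warning above is only a performance note, but it is easy to address. A minimal sketch, assuming `cfg_mask_rcnn` is the Detectron2 config used elsewhere in this notebook: cap the worker count at the number of CPUs the runtime actually reports.

```python
import os

# This Colab runtime reports only 2 CPUs, while the data loader asks for 4
# worker processes; capping the worker count at the CPU count avoids the
# "excessive worker creation" warning.
num_workers = min(4, os.cpu_count() or 1)
# cfg_mask_rcnn.DATALOADER.NUM_WORKERS = num_workers  # set before building loaders
```

The config change must happen before `build_detection_test_loader` (or training) is called, since the loader is constructed from the config at that point.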
InΒ [Β ]:
# Evaluate the model
cfg_mask_rcnn.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Keep only detections with confidence >= 0.5 at test time
# Create a test data loader
test_loader_mask_rcnn = build_detection_test_loader(cfg_mask_rcnn, "food_val")

# Create an evaluator
evaluator_mask_rcnn = COCOEvaluator("food_val", cfg_mask_rcnn, False, output_dir=cfg_mask_rcnn.OUTPUT_DIR)
trainer.test(cfg_mask_rcnn, trainer.model, evaluators=[evaluator_mask_rcnn])

predictor = DefaultPredictor(cfg_mask_rcnn)
WARNING [06/11 21:12:17 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 21:12:17 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 21:12:17 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 21:12:17 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 21:12:17 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 21:12:17 d2.data.common]: Serialized dataset takes 0.06 MiB
WARNING [06/11 21:12:17 d2.evaluation.coco_evaluation]: COCO Evaluator instantiated using config, this is deprecated behavior. Please pass in explicit arguments instead.
WARNING [06/11 21:12:17 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 21:12:17 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json
[06/11 21:12:17 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[06/11 21:12:17 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[06/11 21:12:17 d2.data.common]: Serializing 180 elements to byte tensors and concatenating them all ...
[06/11 21:12:17 d2.data.common]: Serialized dataset takes 0.06 MiB
[06/11 21:12:17 d2.evaluation.evaluator]: Start inference on 180 batches
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
[06/11 21:12:19 d2.evaluation.evaluator]: Inference done 11/180. Dataloading: 0.0059 s/iter. Inference: 0.0822 s/iter. Eval: 0.0003 s/iter. Total: 0.0884 s/iter. ETA=0:00:14
[06/11 21:12:24 d2.evaluation.evaluator]: Inference done 67/180. Dataloading: 0.0044 s/iter. Inference: 0.0851 s/iter. Eval: 0.0003 s/iter. Total: 0.0899 s/iter. ETA=0:00:10
[06/11 21:12:29 d2.evaluation.evaluator]: Inference done 127/180. Dataloading: 0.0031 s/iter. Inference: 0.0838 s/iter. Eval: 0.0003 s/iter. Total: 0.0872 s/iter. ETA=0:00:04
[06/11 21:12:33 d2.evaluation.evaluator]: Total inference time: 0:00:15.364600 (0.087798 s / iter per device, on 1 devices)
[06/11 21:12:33 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:14 (0.083905 s / iter per device, on 1 devices)
[06/11 21:12:33 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/11 21:12:33 d2.evaluation.coco_evaluation]: Saving results to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_mask_rcnn/coco_instances_results.json
[06/11 21:12:33 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[06/11 21:12:33 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[06/11 21:12:34 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.04 seconds.
[06/11 21:12:34 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[06/11 21:12:34 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.05 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.402
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.571
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.465
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.402
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.488
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.491
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.491
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.491
[06/11 21:12:34 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs  |  APm  |  APl   |
|:------:|:------:|:------:|:-----:|:-----:|:------:|
| 40.210 | 57.106 | 46.451 |  nan  |  nan  | 40.210 |
[06/11 21:12:34 d2.evaluation.coco_evaluation]: Some metrics cannot be computed and is shown as NaN.
[06/11 21:12:34 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category       | AP     | category    | AP     | category       | AP     |
|:---------------|:-------|:------------|:-------|:---------------|:-------|
| food-detection | nan    | apple_pie   | 18.820 | chocolate_cake | 39.393 |
| french_fries   | 37.896 | hot_dog     | 35.805 | ice_cream      | 25.407 |
| nachos         | 54.127 | onion_rings | 54.252 | pancakes       | 46.212 |
| pizza          | 81.089 | ravioli     | 34.593 | samosa         | 32.780 |
| spring_rolls   | 22.152 |             |        |                |        |
[06/11 21:12:34 d2.engine.defaults]: Evaluation results for food_val in csv format:
[06/11 21:12:34 d2.evaluation.testing]: copypaste: Task: bbox
[06/11 21:12:34 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/11 21:12:34 d2.evaluation.testing]: copypaste: 40.2105,57.1056,46.4512,nan,nan,40.2105
[06/11 21:12:35 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl ...
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (14, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (14,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (52, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (52,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
WARNING:fvcore.common.checkpoint:The checkpoint state_dict contains keys that are not used by the model:
  roi_heads.mask_head.mask_fcn1.{bias, weight}
  roi_heads.mask_head.mask_fcn2.{bias, weight}
  roi_heads.mask_head.mask_fcn3.{bias, weight}
  roi_heads.mask_head.mask_fcn4.{bias, weight}
  roi_heads.mask_head.deconv.{bias, weight}
  roi_heads.mask_head.predictor.{bias, weight}
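The "incompatible shapes" warnings above are expected rather than a problem: the box-predictor head sizes depend on the number of classes, so the 80-class COCO weights cannot be reused for this dataset's 13-category head and are re-initialized instead. A small sketch of the shape arithmetic (the `box_predictor_shapes` helper is illustrative, not part of Detectron2):

```python
# cls_score has one row per class plus one for background; bbox_pred has
# 4 box-delta rows per class (class-specific regression, the default).
def box_predictor_shapes(num_classes, feat_dim=1024):
    cls_score = (num_classes + 1, feat_dim)
    bbox_pred = (num_classes * 4, feat_dim)
    return cls_score, bbox_pred

print(box_predictor_shapes(80))  # COCO checkpoint: ((81, 1024), (320, 1024))
print(box_predictor_shapes(13))  # this dataset:    ((14, 1024), (52, 1024))
```

These are exactly the `(81, 1024)` vs `(14, 1024)` and `(320,)` vs `(52,)` mismatches reported by the checkpoint loader; the unused `mask_head` keys are skipped for the same reason.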

Model MetricsΒΆ

InΒ [Β ]:
import json
import matplotlib.pyplot as plt

# Path to evaluation results
metrics_path_mask_rcnn = captsone_project_path + "output_food_detection_mask_rcnn/metrics.json"

# Load metrics from the NDJSON evaluation results
def load_metrics_ndjson(metrics_path):
    metrics = []
    with open(metrics_path, "r") as f:
        for line in f:
            metrics.append(json.loads(line))  # Parse each line as a JSON object
    return metrics


# Plot metrics
def plot_metrics_ndjson(metrics):
    # Keep only the training entries; the final evaluation entry in
    # metrics.json lacks the loss keys and would raise a KeyError.
    train_metrics = [m for m in metrics
                     if 'total_loss' in m and 'fast_rcnn/cls_accuracy' in m]

    # Extract iterations, losses, and accuracy-related metrics
    iterations = [m['iteration'] for m in train_metrics]
    total_loss = [m['total_loss'] for m in train_metrics]
    loss_box_reg = [m['loss_box_reg'] for m in train_metrics]
    loss_cls = [m['loss_cls'] for m in train_metrics]
    fg_cls_accuracy = [m['fast_rcnn/fg_cls_accuracy'] for m in train_metrics]
    cls_accuracy = [m['fast_rcnn/cls_accuracy'] for m in train_metrics]
    false_negative = [m['fast_rcnn/false_negative'] for m in train_metrics]

    # Plot losses and accuracy metrics over iterations
    plt.figure(figsize=(10, 6))
    plt.plot(iterations, total_loss, label="Total Loss", marker="o")
    plt.plot(iterations, loss_box_reg, label="Box Regression Loss", marker="o")
    plt.plot(iterations, loss_cls, label="Classification Loss", marker="o")
    plt.plot(iterations, fg_cls_accuracy, label="Foreground Classification Accuracy", marker="o")
    plt.plot(iterations, cls_accuracy, label="Overall Classification Accuracy", marker="o")
    plt.plot(iterations, false_negative, label="False Negative Rate", marker="o")
    plt.xlabel("Iteration")
    plt.ylabel("Metric value")
    plt.title("Training Metrics Over Iterations")
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

# Load and showcase metrics
metrics_ndjson_mask_rcnn = load_metrics_ndjson(metrics_path_mask_rcnn)
plot_metrics_ndjson(metrics_ndjson_mask_rcnn)

ObservationsΒΆ

  • Overall classification accuracy remained stable throughout training.
  • Total loss decreased steadily over the course of training.
  • As with the previous model, foreground classification accuracy improved as the number of iterations increased.
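The per-category bbox AP table above can be cross-checked against the overall AP. The sketch below uses values transcribed from the evaluation output (the `food-detection` superclass, reported as nan, is excluded): the overall AP of 40.210 is simply the mean of the per-category APs, and ranking the categories shows where the model struggles most.

```python
# Per-category bbox AP values transcribed from the evaluation output above.
per_class_ap = {
    "pizza": 81.089, "onion_rings": 54.252, "nachos": 54.127,
    "pancakes": 46.212, "chocolate_cake": 39.393, "french_fries": 37.896,
    "hot_dog": 35.805, "ravioli": 34.593, "samosa": 32.780,
    "ice_cream": 25.407, "spring_rolls": 22.152, "apple_pie": 18.820,
}

# Sanity check: the overall AP reported by COCOEvaluator (40.210) is the
# unweighted mean of the per-category APs.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(round(mean_ap, 2))  # -> 40.21

# Categories ranked from weakest to strongest.
ranked = sorted(per_class_ap, key=per_class_ap.get)
print(ranked[:3])  # -> ['apple_pie', 'spring_rolls', 'ice_cream']
```

The spread is large: pizza (81.1 AP) is detected far more reliably than apple_pie (18.8 AP), which suggests the weaker categories would benefit most from additional annotated images.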
InΒ [Β ]:
# Run inference and evaluation on the validation dataset
results_mask_rcnn = inference_on_dataset(trainer.model, test_loader_mask_rcnn, evaluator_mask_rcnn)
[06/11 21:13:28 d2.evaluation.evaluator]: Start inference on 180 batches
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:624: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(
[06/11 21:13:30 d2.evaluation.evaluator]: Inference done 11/180. Dataloading: 0.0017 s/iter. Inference: 0.0913 s/iter. Eval: 0.0004 s/iter. Total: 0.0934 s/iter. ETA=0:00:15
[06/11 21:13:35 d2.evaluation.evaluator]: Inference done 65/180. Dataloading: 0.0030 s/iter. Inference: 0.0900 s/iter. Eval: 0.0003 s/iter. Total: 0.0935 s/iter. ETA=0:00:10
[06/11 21:13:40 d2.evaluation.evaluator]: Inference done 121/180. Dataloading: 0.0033 s/iter. Inference: 0.0884 s/iter. Eval: 0.0003 s/iter. Total: 0.0921 s/iter. ETA=0:00:05
[06/11 21:13:45 d2.evaluation.evaluator]: Inference done 180/180. Dataloading: 0.0028 s/iter. Inference: 0.0866 s/iter. Eval: 0.0003 s/iter. Total: 0.0898 s/iter. ETA=0:00:00
[06/11 21:13:45 d2.evaluation.evaluator]: Total inference time: 0:00:15.806809 (0.090325 s / iter per device, on 1 devices)
[06/11 21:13:45 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:15 (0.086563 s / iter per device, on 1 devices)
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Saving results to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_mask_rcnn/coco_instances_results.json
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[06/11 21:13:45 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[06/11 21:13:45 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.03 seconds.
[06/11 21:13:45 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[06/11 21:13:45 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.04 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.402
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.571
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.465
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.402
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.488
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.491
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.491
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.491
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs  |  APm  |  APl   |
|:------:|:------:|:------:|:-----:|:-----:|:------:|
| 40.210 | 57.106 | 46.451 |  nan  |  nan  | 40.210 |
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Some metrics cannot be computed and is shown as NaN.
[06/11 21:13:45 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category       | AP     | category    | AP     | category       | AP     |
|:---------------|:-------|:------------|:-------|:---------------|:-------|
| food-detection | nan    | apple_pie   | 18.820 | chocolate_cake | 39.393 |
| french_fries   | 37.896 | hot_dog     | 35.805 | ice_cream      | 25.407 |
| nachos         | 54.127 | onion_rings | 54.252 | pancakes       | 46.212 |
| pizza          | 81.089 | ravioli     | 34.593 | samosa         | 32.780 |
| spring_rolls   | 22.152 |             |        |                |        |
InΒ [Β ]:
print("Evaluation Results:")
for key, value in results_mask_rcnn.items():
    if isinstance(value, dict):  # Handle nested dictionaries
        print(f"{key}:")
        for sub_key, sub_value in value.items():
            print(f"  {sub_key}: {sub_value:.4f}" if isinstance(sub_value, (int, float)) else f"  {sub_key}: {sub_value}")
    else:
        print(f"{key}: {value:.4f}" if isinstance(value, (int, float)) else f"{key}: {value}")
Evaluation Results:
bbox:
  AP: 40.2105
  AP50: 57.1056
  AP75: 46.4512
  APs: nan
  APm: nan
  APl: 40.2105
  AP-food-detection: nan
  AP-apple_pie: 18.8203
  AP-chocolate_cake: 39.3929
  AP-french_fries: 37.8963
  AP-hot_dog: 35.8048
  AP-ice_cream: 25.4071
  AP-nachos: 54.1267
  AP-onion_rings: 54.2525
  AP-pancakes: 46.2118
  AP-pizza: 81.0891
  AP-ravioli: 34.5932
  AP-samosa: 32.7796
  AP-spring_rolls: 22.1515
InΒ [Β ]:
flattened_results = {}
for key, value in results_mask_rcnn.items():
    if isinstance(value, dict):  # Handle nested dictionaries
        for sub_key, sub_value in value.items():
            flattened_results[f"{key}_{sub_key}"] = sub_value
    else:
        flattened_results[key] = value

# Extract keys and values for plotting
keys = list(flattened_results.keys())
values = [flattened_results[key] for key in keys]

# Create a bar chart
plt.figure(figsize=(10, 6))
bars = plt.bar(keys, values, color="skyblue")

# Add percentage labels on top of the bars
for bar in bars:
    height = bar.get_height()
    plt.text(
        bar.get_x() + bar.get_width() / 2,  # X-coordinate
        height + 0.01,  # Y-coordinate (slightly above the bar)
        f"{height:.2f}%",  # AP value (already expressed in percent)
        ha="center",  # Horizontal alignment
        va="bottom",  # Vertical alignment
        fontsize=10,  # Font size
        color="black"  # Text color
    )
plt.title("COCO Evaluation Metrics")
plt.xlabel("Metric")
plt.ylabel("Value")
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()

ObservationΒΆ

  • The overall bounding-box Average Precision is 40.21%, higher than that of the Faster R-CNN model, and a solid starting point for the model.
  • Bounding-box Average Precision is highest for pizza (81.09%), followed by onion_rings (54.25%), nachos (54.13%), pancakes (46.21%), and chocolate_cake (39.39%).
  • The lowest-performing classes were apple_pie (18.82%), spring_rolls (22.15%), and samosa (32.78%).
  • Overall, the Mask R-CNN model performs better than the Faster R-CNN model, both in average precision and on the classes where Faster R-CNN performed poorly.
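The per-class ranking above was read off the table manually; it can also be computed from the evaluator output. A minimal sketch, assuming the `AP-<class>` keys under `results_mask_rcnn['bbox']` shown earlier (illustrated here with a small hand-copied subset of those values):

```python
import math

# Subset of the per-category bbox APs reported above (hand-copied for illustration)
bbox_ap = {
    "AP-pizza": 81.0891,
    "AP-onion_rings": 54.2525,
    "AP-apple_pie": 18.8203,
    "AP-spring_rolls": 22.1515,
    "AP-food-detection": float("nan"),  # parent category has no annotations
}

def rank_classes(bbox_results, prefix="AP-"):
    """Return (class, AP) pairs sorted worst-first, skipping NaN entries."""
    per_class = {
        key[len(prefix):]: value
        for key, value in bbox_results.items()
        if key.startswith(prefix) and not math.isnan(value)
    }
    return sorted(per_class.items(), key=lambda kv: kv[1])

for name, ap in rank_classes(bbox_ap):
    print(f"{name:15s} {ap:6.2f}")  # apple_pie prints first (lowest AP)
```

Running the same helper on the full `results_mask_rcnn["bbox"]` dict would reproduce the worst-to-best ordering discussed in the observations.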
InΒ [Β ]:
# Load Predicted annotated file generated by above predictor
results_path = captsone_project_path + "output_food_detection_mask_rcnn/coco_instances_results.json"

# Load the predictions from the JSON file
with open(results_path, "r") as f:
    predictions_mask_rcnn = json.load(f)

dataset_dicts_validation = DatasetCatalog.get("food_val")
WARNING [06/11 21:23:13 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/11 21:23:13 d2.data.datasets.coco]: Loaded 180 images in COCO format from /content/drive/MyDrive/shortcuts/Capstone/fooddetection-cap-cv3-may24b.v7i.coco/test/_annotations.coco.json

Detect Objects and Visualize the ResultsΒΆ

InΒ [Β ]:
# Visualize the output in tabular format
random_images_validation = random.sample(dataset_dicts_validation, 5)  # Select 5 random images
visualize_output_in_table(random_images_validation, predictions_mask_rcnn, food_metadata)

Pickle Model for future predictionsΒΆ

InΒ [Β ]:
import pickle
import torch

# Save the trained model weights and configuration
def save_model(trainer, cfg, output_dir):
    # Save model weights
    model_weights_path = os.path.join(output_dir, "model_final.pth")
    torch.save(trainer.model.state_dict(), model_weights_path)
    print(f"Model weights saved to {model_weights_path}")

    # Save configuration
    config_path = os.path.join(output_dir, "config.pkl")
    with open(config_path, "wb") as f:
        pickle.dump(cfg, f)
    print(f"Model configuration saved to {config_path}")

# Load the model weights and configuration for future predictions
def load_model(output_dir):
    # Load configuration
    config_path = os.path.join(output_dir, "config.pkl")
    with open(config_path, "rb") as f:
        cfg = pickle.load(f)
    print(f"Model configuration loaded from {config_path}")

    # Load model weights
    model_weights_path = os.path.join(output_dir, "model_final.pth")
    cfg.MODEL.WEIGHTS = model_weights_path
    print(f"Model weights loaded from {model_weights_path}")

    # Create predictor
    predictor = DefaultPredictor(cfg)
    return predictor

# Save the model
save_model(trainer, cfg_mask_rcnn, cfg_mask_rcnn.OUTPUT_DIR)
Model weights saved to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_mask_rcnn/model_final.pth
Model configuration saved to /content/drive/MyDrive/shortcuts/Capstone/output_food_detection_mask_rcnn/config.pkl
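Pickling a config is convenient but can break silently if class definitions change between environments; a cheap safeguard is an immediate save/reload/compare round trip. A minimal sketch using a plain dict as a stand-in for the Detectron2 `CfgNode` (the keys and values are illustrative):

```python
import os
import pickle
import tempfile

# Stand-in for the Detectron2 config: a plain dict with hypothetical fields.
cfg_stub = {"MODEL.WEIGHTS": "model_final.pth", "MODEL.ROI_HEADS.NUM_CLASSES": 13}

tmp_dir = tempfile.mkdtemp()
config_path = os.path.join(tmp_dir, "config.pkl")

# Save, then immediately reload and compare: a cheap guard against a corrupt
# or incompatible pickle before relying on it for future predictions.
with open(config_path, "wb") as f:
    pickle.dump(cfg_stub, f)
with open(config_path, "rb") as f:
    reloaded = pickle.load(f)

assert reloaded == cfg_stub, "pickle round-trip changed the configuration"
print("config round-trip OK:", reloaded["MODEL.ROI_HEADS.NUM_CLASSES"], "classes")
```

As a more portable alternative, Detectron2's `CfgNode` can also be serialized to YAML with `cfg.dump()`, which tends to survive library upgrades better than a pickle.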

Clickable GUI (Runs in local system)ΒΆ

  • Streamlit is used to create a clickable GUI.
  • The GUI runs on localhost at http://localhost:8501/.
  • The GUI has a browse option to select an input image.
  • Once the image is loaded, the original image and the predictions are displayed side by side, along with the class labels and confidence scores.
  • Screenshot below

image.png

InΒ [Β ]:
!pip install streamlit
Collecting streamlit
  Downloading streamlit-1.45.1-py3-none-any.whl.metadata (8.9 kB)
Requirement already satisfied: altair<6,>=4.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (5.5.0)
Requirement already satisfied: blinker<2,>=1.5.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (1.9.0)
Requirement already satisfied: cachetools<6,>=4.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (5.5.2)
Requirement already satisfied: click<9,>=7.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (8.2.1)
Requirement already satisfied: numpy<3,>=1.23 in /usr/local/lib/python3.11/dist-packages (from streamlit) (2.0.2)
Requirement already satisfied: packaging<25,>=20 in /usr/local/lib/python3.11/dist-packages (from streamlit) (24.2)
Requirement already satisfied: pandas<3,>=1.4.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (2.2.2)
Requirement already satisfied: pillow<12,>=7.1.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (11.2.1)
Requirement already satisfied: protobuf<7,>=3.20 in /usr/local/lib/python3.11/dist-packages (from streamlit) (5.29.5)
Requirement already satisfied: pyarrow>=7.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (18.1.0)
Requirement already satisfied: requests<3,>=2.27 in /usr/local/lib/python3.11/dist-packages (from streamlit) (2.32.3)
Requirement already satisfied: tenacity<10,>=8.1.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (9.1.2)
Requirement already satisfied: toml<2,>=0.10.1 in /usr/local/lib/python3.11/dist-packages (from streamlit) (0.10.2)
Requirement already satisfied: typing-extensions<5,>=4.4.0 in /usr/local/lib/python3.11/dist-packages (from streamlit) (4.14.0)
Collecting watchdog<7,>=2.1.5 (from streamlit)
  Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl.metadata (44 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.3/44.3 kB 3.3 MB/s eta 0:00:00
Requirement already satisfied: gitpython!=3.1.19,<4,>=3.0.7 in /usr/local/lib/python3.11/dist-packages (from streamlit) (3.1.44)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Requirement already satisfied: tornado<7,>=6.0.3 in /usr/local/lib/python3.11/dist-packages (from streamlit) (6.4.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.11/dist-packages (from altair<6,>=4.0->streamlit) (3.1.6)
Requirement already satisfied: jsonschema>=3.0 in /usr/local/lib/python3.11/dist-packages (from altair<6,>=4.0->streamlit) (4.24.0)
Requirement already satisfied: narwhals>=1.14.2 in /usr/local/lib/python3.11/dist-packages (from altair<6,>=4.0->streamlit) (1.41.0)
Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/lib/python3.11/dist-packages (from gitpython!=3.1.19,<4,>=3.0.7->streamlit) (4.0.12)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.11/dist-packages (from pandas<3,>=1.4.0->streamlit) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas<3,>=1.4.0->streamlit) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas<3,>=1.4.0->streamlit) (2025.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests<3,>=2.27->streamlit) (3.4.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests<3,>=2.27->streamlit) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests<3,>=2.27->streamlit) (2.4.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests<3,>=2.27->streamlit) (2025.4.26)
Requirement already satisfied: smmap<6,>=3.0.1 in /usr/local/lib/python3.11/dist-packages (from gitdb<5,>=4.0.1->gitpython!=3.1.19,<4,>=3.0.7->streamlit) (5.0.2)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2->altair<6,>=4.0->streamlit) (3.0.2)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit) (25.3.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit) (2025.4.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit) (0.36.2)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from jsonschema>=3.0->altair<6,>=4.0->streamlit) (0.25.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.8.2->pandas<3,>=1.4.0->streamlit) (1.17.0)
Downloading streamlit-1.45.1-py3-none-any.whl (9.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.9/9.9 MB 39.9 MB/s eta 0:00:00
Downloading pydeck-0.9.1-py2.py3-none-any.whl (6.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.9/6.9 MB 49.3 MB/s eta 0:00:00
Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl (79 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.1/79.1 kB 8.6 MB/s eta 0:00:00
Installing collected packages: watchdog, pydeck, streamlit
Successfully installed pydeck-0.9.1 streamlit-1.45.1 watchdog-6.0.0

GUI CodeΒΆ

InΒ [Β ]:
import streamlit as st
import os
import cv2
import numpy as np
from PIL import Image
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
import pickle
import warnings
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import ColorMode

# Suppress PyTorch warnings
warnings.filterwarnings("ignore", category=UserWarning, module="torch")

# Disable Streamlit's file watcher
# st.set_option("server.fileWatcherType", "none")

# Define class names manually (ensure the order matches the training dataset)
class_names = ['food-detection',
 'apple_pie',
 'chocolate_cake',
 'french_fries',
 'hot_dog',
 'ice_cream',
 'nachos',
 'onion_rings',
 'pancakes',
 'pizza',
 'ravioli',
 'samosa',
 'spring_rolls']

# Create metadata for visualization
metadata = MetadataCatalog.get("food_detection_metadata")
metadata.thing_classes = class_names

# Load the model configuration and weights
@st.cache_resource
def load_model(output_dir):
    # Log the model loading process
    print('Loading Food detection Model...')

    # Load configuration
    config_path = os.path.join(output_dir, "config.pkl")
    with open(config_path, "rb") as f:
        cfg = pickle.load(f)
    print(f"Model configuration loaded from {config_path}")

    # Load model weights
    model_weights_path = os.path.join(output_dir, "model_final.pth")
    cfg.MODEL.WEIGHTS = model_weights_path
    print(f"Model weights loaded from {model_weights_path}")

    # Force the model to run on the CPU
    cfg.MODEL.DEVICE = "cpu"

    # Create predictor
    predictor = DefaultPredictor(cfg)
    print('Loaded Food detection Model')
    return predictor, cfg

# Define paths
output_dir = "output_food_detection_mask_rcnn"

# Load the model once
predictor, cfg = load_model(output_dir)

# Streamlit UI
st.markdown("# 🍴 Food Detection Model")
st.markdown("Upload an image to detect food items, bounding boxes, and prediction details.")

# File uploader
uploaded_file = st.file_uploader("### 📂 Choose an image...", type=["jpg", "jpeg", "png"])

if uploaded_file is not None:
    # Convert the uploaded file to a temporary file path
    temp_file_path = os.path.join("temp_uploaded_image.jpg")
    with open(temp_file_path, "wb") as f:
        f.write(uploaded_file.getbuffer())

    # Load the image using cv2
    image = cv2.imread(temp_file_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB for visualization

    # Run the model on the image
    outputs = predictor(image)

    # Extract prediction details
    instances = outputs["instances"].to("cpu")
    prediction_details = []
    if "instances" in outputs:
        for i in range(len(instances)):
            class_id = instances.pred_classes[i].item()
            class_name = class_names[class_id]  # Map class ID to class name
            score = instances.scores[i].item()
            prediction_details.append(f"{class_name}: {score:.2f}")  # score is a 0-1 fraction, scaled to a percentage in the details column

    # Visualize the predictions with distinct bounding boxes
    visualizer = Visualizer(
        image_rgb[:, :, ::-1],  # Convert RGB to BGR for visualization
        metadata=metadata,
        scale=0.8,  # Adjust scale for better visualization
        instance_mode=ColorMode.IMAGE  # Keep the background in color
    )

    predicted_image = image.copy()

    # Customize bounding box thickness and color based on image background
    instances = outputs["instances"].to("cpu")
    for i in range(len(instances)):
        box = instances.pred_boxes[i].tensor.numpy()[0]  # Extract bounding box coordinates
        class_id = instances.pred_classes[i].item()
        class_name = class_names[class_id]
        score = instances.scores[i].item()

        # Calculate average brightness of the image to decide border color
        avg_brightness = np.mean(predicted_image)
        border_color = (0, 0, 0) if avg_brightness > 128 else (255, 255, 255)  # Black for bright images, white for dark images

        # Draw bounding box with dynamic color and thickness
        x1, y1, x2, y2 = map(int, box)
        cv2.rectangle(predicted_image, (x1, y1), (x2, y2), border_color, thickness=10)  # Dynamic border color with thickness 10

        # Add class name and confidence score as label with a background
        label = f"{class_name}: {score:.2f}"

        # Get the text size to calculate the rectangle dimensions
        (text_width, text_height), baseline = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, fontScale=3.5, thickness=16)

        # Define the rectangle coordinates
        rect_x1, rect_y1 = x1, y1 - text_height - 10  # Top-left corner of the rectangle
        rect_x2, rect_y2 = x1 + text_width, y1  # Bottom-right corner of the rectangle

        # Draw the rectangle (background) with a contrasting color
        background_color = (255, 255, 255) if border_color == (0, 0, 0) else (0, 0, 0)  # White for black borders, black for white borders
        cv2.rectangle(predicted_image, (rect_x1, rect_y1), (rect_x2, rect_y2), background_color, thickness=-1)  # Filled rectangle

        # Draw the text on top of the rectangle
        cv2.putText(predicted_image, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, fontScale=3.5, color=border_color, thickness=16)

    # Convert the image back to RGB for visualization
    predicted_image_rgb = predicted_image[:, :, ::-1]  # Reverse the color channels (BGR to RGB)
    # Create three columns for input, output, and prediction details
    col1, col2, col3 = st.columns([3, 3, 1.4])  # Adjust proportions: col1 and col2 are larger than col3

    # Display the original image in the first column
    with col1:
        st.markdown("### 🖼️ Uploaded Image")
        st.image(image_rgb, use_container_width=True)

    # Display the predicted image in the second column
    with col2:
        st.markdown("### 📊 Predicted Output")
        # Convert the predicted image from BGR to RGB
        #predicted_image_rgb = vis.get_image()[:, :, ::-1]
        st.image(predicted_image_rgb, use_container_width=True)

    # Display the prediction details in the third column
    with col3:
        st.markdown("### 🔍 Details")
        for detail in prediction_details:
            # Split the class name and confidence score
            class_name, confidence = detail.split(":")
            # Remove the '%' symbol and convert confidence to percentage
            confidence_value = float(confidence.strip().replace('%', '')) * 100
            st.markdown(f"**:blue[{class_name.strip()}]**: {confidence_value:.2f}%")
2025-06-12 22:19:14.614 Thread 'MainThread': missing ScriptRunContext! This warning can be ignored when running in bare mode.
Loading Food detection Model...
2025-06-12 22:19:15.116 Thread 'Thread-10': missing ScriptRunContext! This warning can be ignored when running in bare mode.
Model configuration loaded from /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/output_food_detection_mask_rcnn/config.pkl
Model weights loaded from /content/drive/MyDrive/shortcuts/Gr 6 CV3 - CapStone Project/Capstone_Project/output_food_detection_mask_rcnn/model_final.pth
2025-06-12 22:19:24.153 Thread 'MainThread': missing ScriptRunContext! This warning can be ignored when running in bare mode.
Loaded Food detection Model

Sample Prediction in the GUIΒΆ

  • Screenshot below

image.png

Insights and RecommendationsΒΆ

Insights & Observations

EfficientNet for Food Classification:

EfficientNet's compound scaling allows it to achieve high accuracy with fewer parameters, making it efficient for classification tasks involving large food datasets.

It shows strong performance in recognizing fine-grained food categories, especially when trained on well-annotated datasets like Food-101, UECFood100, or VFN.

Faster R-CNN vs. Mask R-CNN for Food Detection:

Both models yielded comparable precision and accuracy, with Mask R-CNN performing marginally better due to its additional segmentation capabilities.

Mask R-CNN's pixel-level precision helps disambiguate overlapping or occluded food items more effectively than Faster R-CNN.

Recommendations for Model Improvements

Model Enhancements:

Replace EfficientNet with CoAtNet or ConvNeXt:

These models outperform EfficientNet in fine-grained classification and may offer better generalization.

Use hybrid detectors like DETR or DINO:

Transformer-based models like DETR (DEtection TRansformer) and DINO offer state-of-the-art object detection performance with better contextual understanding and fewer hand-crafted components.

Instance Segmentation + Depth Estimation:

Integrating monocular depth estimation (e.g., MiDaS) with Mask R-CNN could improve detection in cluttered environments by incorporating spatial cues.

Training Improvements:

Augmentation strategies:

Use domain-specific augmentations like lighting variation, occlusion simulation, or synthetic plating for food datasets.
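Two of these augmentations (lighting variation and occlusion simulation) can be prototyped with plain NumPy before wiring them into the Detectron2 dataloader; this is an illustrative sketch, not part of the training pipeline above:

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter_brightness(image, low=0.7, high=1.3):
    """Simulate lighting variation by scaling pixel intensities."""
    factor = rng.uniform(low, high)
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def random_cutout(image, max_frac=0.25):
    """Simulate occlusion by blanking a random rectangle (cutout)."""
    h, w = image.shape[:2]
    ch = int(h * rng.uniform(0.1, max_frac))
    cw = int(w * rng.uniform(0.1, max_frac))
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    out = image.copy()
    out[y:y + ch, x:x + cw] = 0  # black patch standing in for an occluder
    return out

# Apply to a dummy 64x64 RGB image
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = random_cutout(jitter_brightness(img))
print(augmented.shape, augmented.dtype)
```

In the real pipeline these transforms would run inside a custom dataset mapper so the bounding-box annotations stay consistent with the augmented image.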

Semi-supervised learning:

Leverage unlabelled food images using pseudo-labelling or contrastive learning approaches (e.g., SimCLR, BYOL) to enhance performance with limited annotations.
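Pseudo-labelling, in its simplest form, keeps only high-confidence predictions on unlabelled images and treats them as ground truth for another training round. A sketch of the selection step (the threshold, file names, and prediction format are all illustrative, not the notebook's actual pipeline):

```python
# Each prediction: (class_name, confidence, bbox). Hypothetical model outputs
# on unlabelled images; in practice these would come from the trained predictor.
unlabelled_predictions = {
    "img_001.jpg": [("pizza", 0.97, (10, 10, 200, 180)),
                    ("samosa", 0.41, (30, 40, 90, 120))],
    "img_002.jpg": [("nachos", 0.88, (5, 5, 150, 140))],
}

def select_pseudo_labels(predictions, threshold=0.8):
    """Keep only detections confident enough to be reused as training labels."""
    pseudo = {}
    for image, dets in predictions.items():
        kept = [d for d in dets if d[1] >= threshold]
        if kept:  # images with no confident detection stay unlabelled
            pseudo[image] = kept
    return pseudo

pseudo_labels = select_pseudo_labels(unlabelled_predictions)
print(pseudo_labels)
```

The selected detections would then be merged with the annotated set for a second round of fine-tuning.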

Post-Processing & Metrics:

Implement non-maximum suppression (NMS) refinements like Soft-NMS or DIoU-NMS for better handling of overlapping predictions.
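For reference, linear Soft-NMS decays the score of each box that overlaps an already-selected box by a factor of (1 - IoU) instead of discarding it outright. A self-contained NumPy sketch (the detector integration is omitted):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.05):
    """Linear Soft-NMS: decay overlapping scores instead of dropping boxes."""
    boxes = boxes.astype(np.float64)
    scores = scores.astype(np.float64)
    keep, idxs = [], list(range(len(scores)))
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        if not idxs:
            break
        overlaps = iou(boxes[best], boxes[idxs])
        for j, ov in zip(list(idxs), overlaps):
            if ov > iou_thresh:
                scores[j] *= (1.0 - ov)  # linear score decay
            if scores[j] < score_thresh:
                idxs.remove(j)  # prune only boxes decayed below the floor
    return keep, scores

# Two heavily overlapping detections plus one distinct box
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]])
scores = np.array([0.9, 0.8, 0.7])
keep, new_scores = soft_nms(boxes, scores)
print(keep, np.round(new_scores, 3))  # box 1 survives with a decayed score
```

Hard NMS at the same threshold would delete the second box entirely; Soft-NMS keeps it with a reduced score, which helps when food items genuinely overlap on a plate.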

Evaluate not just with mAP but also IoU variance across classes, especially for fine-grained food items.

Industry Applications and Use Cases

Quick Service Restaurants (QSRs) & Cloud Kitchens:

  • Use Case 1: Real-time tray detection for automated quality checks or portion control.
  • Requirement: High accuracy and precision (Mask R-CNN recommended).

Smart Canteens & Automated Buffet Billing:

  • Use Case 2: Auto-identify food items on trays for billing without manual input.
  • Requirement: Moderate-to-high accuracy; real-time response favored (a lightweight Mask R-CNN or YOLOv8-seg could be beneficial).

Diet Monitoring & Health Tech:

  • Use Case 3: Smartphone-based food-logging apps with visual detection.
  • Requirement: Lightweight models with acceptable accuracy (MobileNet + YOLOv5 or YOLO-NAS).

Waste Management & Food Recognition for Sustainability:

  • Use Case 4: Segregation of leftover food types for recycling or composting.
  • Requirement: Emphasis on robustness under poor lighting or occlusion (Mask R-CNN or DETR with attention-based models).

Retail & Supermarkets:

  • Use Case 5: Self-checkout systems recognizing unpackaged fresh foods.
  • Requirement: High precision to avoid billing errors (DETR, DINO, Mask R-CNN).